Thursday, January 10, 2013

Database Availability Groups 2010\2013


Database Availability Groups:


A database availability group (DAG) is the base component of the high availability and site resilience framework built into Microsoft Exchange Server 2013. A DAG is a group of up to 16 Mailbox servers that hosts a set of databases and provides automatic database-level recovery from failures that affect individual servers or databases.
A DAG is a boundary for mailbox database replication, database and server switchovers, failovers, and an internal component called Active Manager. Active Manager, which runs on every server in a DAG, manages switchovers and failovers.
Any server in a DAG can host a copy of a mailbox database from any other server in the DAG. When a server is added to a DAG, it works with the other servers in the DAG to provide automatic recovery from failures that affect mailbox databases, such as a disk failure or server failure.




To illustrate how a DAG can provide high availability for your mailbox databases, consider the following example, which uses a DAG with five members. This DAG is illustrated in the following figure.
DAG with five members



Database Availability Group
In the preceding figure, the green databases are active mailbox database copies and the blue databases are passive mailbox database copies. In this example, the database copies aren't mirrored across each server, but rather spread across multiple servers. This ensures that no two servers in the DAG have the same set of database copies, providing the DAG with greater resilience to failures, including failures that occur while other components are unavailable as a result of regular maintenance.
Consider the following scenario, using the preceding example DAG, which illustrates resilience to multiple database and server failures.
Initially, all databases and servers are healthy. You need to install some operating system updates on EX2, so you put the server into maintenance mode. This causes a server switchover, which activates the copy of DB4 on another Mailbox server. A server switchover moves all active mailbox database copies from their current server to one or more other Mailbox servers in the DAG in preparation for a scheduled outage for the current server. In this example, there's only one active mailbox database on EX2 (DB4), so only one active mailbox database copy is moved.
DAG with a server offline for maintenance

Database Availability Group with a Server Offline
While you perform maintenance on EX2, EX3 experiences a catastrophic hardware failure and goes offline. Prior to going offline, EX3 hosted the active copy of DB2. To recover from the failure, the system automatically activates the copy of DB2 that's hosted on EX1 within 30 seconds. This is illustrated in the following figure.
DAG with a server offline for maintenance and a failed server

DAG with a server offline and a failed server
After the scheduled maintenance is completed for EX2, you bring the server online and take it out of maintenance mode. As soon as EX2 is available, the other members of the DAG are notified, and the copies of DB1, DB4, and DB5 hosted on EX2 are automatically synchronized with the active copy of each database. This is illustrated in the following figure.
DAG with a restored server synchronizing its database copies


DAG with restored server resynchronizing databases
After the failed hardware component in EX3 is replaced with a new component, EX3 is brought online. After EX3 is available, the other members of the DAG are notified, and the copies of DB2, DB3, and DB4 hosted on EX3 are automatically synchronized with the active copy of each database. This is illustrated in the following figure.
DAG with a repaired server synchronizing its database copies
DAG with Member Resynchronizing Database Copies




In addition to providing high availability within a datacenter, a DAG can also be extended to one or more datacenters in a configuration that provides site resilience for one or multiple datacenters. In the preceding example figures, the DAG is located in a single datacenter and single Active Directory site. Incremental deployment can be used to extend this DAG to a second datacenter (and a second Active Directory site) by deploying a Mailbox server and the necessary supporting resources (one or more Active Directory servers, and DNS services). The Mailbox server is then added to the DAG, as illustrated in the following figure.
DAG extended across two Active Directory sites

DAG extended across two Active Directory sites
In this example, a passive copy of each active database in the Redmond datacenter is configured on EX6 in the Dublin datacenter. However, there are many other examples of DAG configurations that provide site resilience. For example:
  • Instead of hosting only passive database copies, EX6 could host all active copies, or it could host a mixture of active and passive copies.
  • In addition to EX6, multiple DAG members could be deployed in the Dublin datacenter, providing protection against additional failures. This configuration also provides additional capacity, so that if the Redmond datacenter fails, the Dublin datacenter can support a much larger user population.






In the preceding example, a single DAG extends across multiple datacenters, providing site resilience for either or both datacenters. When using a single DAG to provide site resilience in an environment where each datacenter to which you extend the DAG has an active user population, there is a single point of failure in the wide area network (WAN) connection. This is because quorum requires a majority of the voters to be active and able to communicate with each other.
In the preceding example, the majority of voters are located in the Redmond datacenter. If the Dublin datacenter hosts active mailbox databases, and it has a local user population, a WAN outage would result in a messaging service outage for the Dublin users. When WAN connectivity breaks, only the DAG members in the Redmond datacenter retain quorum and continue providing messaging service.
To eliminate the WAN as a single point of failure when you need to provide site resilience for multiple datacenters that each have an active user population, you should deploy multiple DAGs, where each DAG has a majority of voters in a separate datacenter. When a WAN outage occurs, replication will be blocked until connectivity is restored. Users will have messaging service, because each DAG continues to service its local user population.


No comments:

Post a Comment