Commit fd0971b

Merge pull request #32825 from WilliamDAssafMSFT/20250113-hs-zr
20250113 HA ZR
2 parents c56fd6e + cd347b8 commit fd0971b

2 files changed

Lines changed: 57 additions & 22 deletions

azure-sql/database/failover-group-sql-db.md

Lines changed: 57 additions & 22 deletions
Original file line number | Diff line number | Diff line change
@@ -1,10 +1,10 @@
11
---
22
title: Failover groups overview & best practices
33
description: Failover groups let you manage geo-replication and automatic / coordinated failover of a group of databases on a server for both single and pooled databases in Azure SQL Database.
4-
author: rajeshsetlem
5-
ms.author: rsetlem
6-
ms.reviewer: wiassaf, mathoma
7-
ms.date: 09/30/2024
4+
author: WilliamDAssafMSFT
5+
ms.author: wiassaf
6+
ms.reviewer: rsetlem, mathoma
7+
ms.date: 01/13/2025
88
ms.service: azure-sql-database
99
ms.subservice: high-availability
1010
ms.topic: conceptual
@@ -21,26 +21,28 @@ ms.custom:
2121
2222
The failover groups feature allows you to manage the replication and failover of some or all databases on a [logical server](logical-servers.md) to a logical server in another region. This article provides an overview of the failover group feature with best practices and recommendations for using it with Azure SQL Database.
2323

24-
To get started using the feature, review [Configure failover group](failover-group-configure-sql-db.md).
24+
To get started using the feature, review [Configure a failover group for Azure SQL Database](failover-group-configure-sql-db.md).
2525

2626
> [!NOTE]
27-
> This article covers failover groups for Azure SQL Database. For Azure SQL Managed Instance, see [Failover groups in Azure SQL Managed Instance](../managed-instance/failover-group-sql-mi.md).
27+
> This article covers failover groups for Azure SQL Database. For Azure SQL Managed Instance, see [Failover groups overview & best practices - Azure SQL Managed Instance](../managed-instance/failover-group-sql-mi.md).
2828
2929
To learn more about Azure SQL Database disaster recovery, watch this video:
3030

3131
> [!VIDEO https://learn-video.azurefd.net/vod/player?id=f02c00dc-de7d-4d27-a087-41c562c214d7]
3232
3333
## Overview
3434

35-
The failover groups feature allows you to manage the replication and failover of databases to another Azure region. You can choose all, or a subset of, user databases in a logical server to be replicated to another logical server. It's a declarative abstraction on top of the [active geo-replication](../database/active-geo-replication-overview.md) feature, designed to simplify deployment and management of geo-replicated databases at scale.
35+
The failover groups feature allows you to manage the replication and failover of databases to another Azure region. You can choose all, or a subset of, user databases in a logical server to be replicated to another logical server. It's a declarative abstraction on top of the [active geo-replication](active-geo-replication-overview.md) feature, designed to simplify deployment and management of geo-replicated databases at scale.
3636

3737
For geo-failover RPO and RTO, see [overview of business continuity](business-continuity-high-availability-disaster-recover-hadr-overview.md#rto-and-rpo).
3838

3939

4040
[!INCLUDE [failover-groups-overview](../includes/failover-group-overview.md)]
4141

4242

43-
## <a id="terminology-and-capabilities"></a> Terminology and capabilities
43+
<a id="terminology-and-capabilities"></a>
44+
45+
## Terminology and capabilities
4446

4547
<!--
4648
There is some overlap of content in the following articles, be sure to make changes to all if necessary:
@@ -101,11 +103,13 @@ There is some overlap of content in the following articles, be sure to make chan
101103

102104
A failover group in Azure SQL Database can include one or multiple databases, typically used by the same application. A failover group must be configured on the primary server, which connects it to the secondary server in a different Azure region. The failover group can include all or some databases in the primary server. The following diagram illustrates a typical configuration of a geo-redundant cloud application using multiple databases in a failover group:
103105

104-
:::image type="content" source="./media/failover-group-overview/failover-group.png" alt-text="Diagram shows a typical configuration of a geo-redundant cloud application using multiple databases and a failover group.":::
106+
:::image type="content" source="media/failover-group-sql-db/failover-group.png" alt-text="Diagram shows a typical configuration of a geo-redundant cloud application using multiple databases and a failover group.":::
107+
108+
When designing a service with business continuity in mind, follow the general guidelines and best practices outlined in this article. When configuring a failover group, ensure that authentication and network access on the secondary is set up to function correctly after geo-failover, when the geo-secondary becomes the new primary. For details, see [Configure and manage Azure SQL Database security for geo-restore or failover](active-geo-replication-security-configure.md). For more information, see [Designing globally available services using Azure SQL Database](designing-cloud-solutions-for-disaster-recovery.md).
105109

106-
When designing a service with business continuity in mind, follow the general guidelines and best practices outlined in this article. When configuring a failover group, ensure that authentication and network access on the secondary is set up to function correctly after geo-failover, when the geo-secondary becomes the new primary. For details, see [SQL Database security after disaster recovery](active-geo-replication-security-configure.md). For more information, see [Designing cloud solutions for disaster recovery](designing-cloud-solutions-for-disaster-recovery.md).
110+
<a id="using-geo-paired-regions"></a>
107111

108-
## <a id="using-geo-paired-regions"></a> Use paired regions
112+
## Use paired regions
109113

110114
When creating your failover group between the primary and secondary server, use [paired regions](/azure/reliability/cross-region-replication-azure), because failover groups in paired regions perform better than those in unpaired regions.
111115

@@ -124,53 +128,84 @@ The number of databases within a failover group directly impacts the duration of
124128
- During a Failover (also known as Planned Failover), we ensure that all primary databases are fully synchronized with their secondaries and reach a ready state. To avoid overwhelming the control plane, databases are prepared in batches. Therefore, it is highly recommended to limit the number of databases in a failover group.
125129
- In the case of a Forced Failover, the preparation phase is expedited as data synchronization is not initiated. To achieve quicker and more predictable failover durations, it might be beneficial to keep the number of databases in the failover group small.
126130

127-
## <a id="using-one-or-several-failover-groups-to-manage-failover-of-multiple-databases"></a> Use multiple failover groups to fail over multiple databases
131+
<a id="using-one-or-several-failover-groups-to-manage-failover-of-multiple-databases"></a>
132+
133+
## Use multiple failover groups to fail over multiple databases
128134

129135
One or many failover groups can be created between two servers in different regions (primary and secondary servers). Each group can include one or several databases that are recovered as a unit in case all or some primary databases become unavailable due to an outage in the primary region. Creating a failover group creates geo-secondary databases with the same service objective as the primary. If you add an existing geo-replication relationship to a failover group, make sure the geo-secondary is configured with the same service tier and compute size as the primary.
130136

131-
## <a id="using-read-write-listener-for-oltp-workload"></a> Use the read-write listener (primary)
137+
<a id="using-read-write-listener-for-oltp-workload"></a>
138+
139+
## Use the read-write listener (primary)
132140

133141
For read-write workloads, use `<fog-name>.database.windows.net` as the server name in the connection string. Connections are automatically directed to the primary. This name doesn't change after failover. Note that failover involves updating the DNS record, so client connections are redirected to the new primary only after the client DNS cache is refreshed. The time to live (TTL) of the primary and secondary listener DNS records is 30 seconds.
134142

135-
## <a id="using-read-only-listener-for-read-only-workload"></a> Use the read-only listener (secondary)
143+
<a id="using-read-only-listener-for-read-only-workload"></a>
144+
145+
## Use the read-only listener (secondary)
136146

137147
If you have logically isolated read-only workloads that are tolerant to data latency, you can run them on the geo-secondary. For read-only sessions, use `<fog-name>.secondary.database.windows.net` as the server name in the connection string. Connections are automatically directed to the geo-secondary. It's also recommended that you indicate read intent in the connection string by using `ApplicationIntent=ReadOnly`.
138148

139149
In the Premium, Business Critical, and Hyperscale service tiers, SQL Database supports the use of [read-only replicas](read-scale-out.md) to offload read-only query workloads, using the `ApplicationIntent=ReadOnly` parameter in the connection string. When you have configured a geo-secondary, you can use this capability to connect to either a read-only replica in the primary location or in the geo-secondary location:
140150

141151
To connect to a read-only replica in the secondary location, use `ApplicationIntent=ReadOnly` and `<fog-name>.secondary.database.windows.net`.
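
The two listener names described above can be turned into connection strings roughly as follows. This is a minimal sketch, not an official sample: the helper names, the ODBC driver name, and the sample values `my-fog` and `mydb` are assumptions for illustration, and authentication settings are deliberately left out. Only the host name patterns and `ApplicationIntent=ReadOnly` come from this article.

```python
def listener_host(fog_name: str, read_only: bool = False) -> str:
    """Return the failover group listener host name.

    Read-write sessions use <fog-name>.database.windows.net; read-only
    sessions use <fog-name>.secondary.database.windows.net.
    """
    suffix = "secondary.database.windows.net" if read_only else "database.windows.net"
    return f"{fog_name}.{suffix}"


def build_connection_string(fog_name: str, database: str, read_only: bool = False) -> str:
    """Build an ODBC-style connection string (hypothetical helper).

    The driver name is an assumption; add credentials or another
    authentication option before using the result.
    """
    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",
        f"Server=tcp:{listener_host(fog_name, read_only)},1433",
        f"Database={database}",
        "Encrypt=yes",
    ]
    if read_only:
        # Indicate read intent, as recommended for the read-only listener.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"


if __name__ == "__main__":
    print(build_connection_string("my-fog", "mydb"))
    print(build_connection_string("my-fog", "mydb", read_only=True))
```

Because both strings are built from the failover group name rather than a server name, they stay the same after a failover; only the DNS records behind the listeners change.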
142152

143-
## <a id="preparing-for-performance-degradation"></a> Potential performance degradation after failover
153+
<a id="preparing-for-performance-degradation"></a>
154+
155+
## Potential performance degradation after failover
144156

145157
A typical Azure application uses multiple Azure services and consists of multiple components. Failover of a group is triggered based on the state of Azure SQL Database alone. Other Azure services in the primary region might not be affected by the outage and their components might still be available in that region. Once the primary databases switch to the secondary (DR) region, latency between dependent components can increase. To avoid the impact of higher latency on the application's performance, ensure the redundancy of all the application's components in the DR region, follow these [network security guidelines](failover-group-configure-sql-db.md#failover-groups-and-network-security), and orchestrate the geo-failover of relevant application components together with the database.
146158

147-
## <a id="preparing-for-data-loss"></a> Potential data loss after forced failover
159+
<a id="preparing-for-data-loss"></a>
160+
161+
## Potential data loss after forced failover
148162

149163
If an outage occurs in the primary region, recent transactions might not have been replicated to the geo-secondary and there might be data loss if a forced failover is performed.
150164

151165
> [!IMPORTANT]
152166
> Elastic pools with 800 or fewer DTUs or 8 or fewer vCores, and more than 250 databases can encounter issues including longer planned geo-failovers and degraded performance. These issues are more likely to occur for write intensive workloads when geo-replicas are widely separated by geography, or when multiple secondary geo-replicas are used for each database. A symptom of these issues is an increase in geo-replication lag over time, potentially leading to a more extensive data loss in an outage. This lag can be monitored using [sys.dm_geo_replication_link_status](/sql/relational-databases/system-dynamic-management-views/sys-dm-geo-replication-link-status-azure-sql-database). If these issues occur, then mitigation includes scaling up the pool to have more DTUs or vCores, or reducing the number of geo-replicated databases in the pool.
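
As a rough illustration of the monitoring mentioned in the note above, the following sketch pairs a query against `sys.dm_geo_replication_link_status` with a small check for links whose lag exceeds a threshold. The function name, the 5-second threshold, and the dictionary row shape are assumptions for illustration; running the query requires a database connection, which is not shown here.

```python
# Query to run against the primary database. The columns selected are
# part of sys.dm_geo_replication_link_status.
LAG_QUERY = """
SELECT partner_server, partner_database, replication_state_desc, replication_lag_sec
FROM sys.dm_geo_replication_link_status;
"""


def links_over_threshold(rows, max_lag_sec=5):
    """Return the rows whose replication lag exceeds max_lag_sec.

    rows: iterable of dicts keyed by the column names in LAG_QUERY.
    max_lag_sec: hypothetical threshold; choose one that matches your RPO.
    Rows with no lag value (None) are skipped here, which is an assumption
    of this sketch rather than documented guidance.
    """
    flagged = []
    for row in rows:
        lag = row["replication_lag_sec"]
        if lag is not None and lag > max_lag_sec:
            flagged.append(row)
    return flagged


if __name__ == "__main__":
    sample = [
        {"partner_database": "db1", "replication_lag_sec": 1},
        {"partner_database": "db2", "replication_lag_sec": 42},
    ]
    for row in links_over_threshold(sample):
        print(row["partner_database"], row["replication_lag_sec"])
```

Sampling this periodically and watching for a steady increase, rather than a single high value, is closer to the symptom the note describes.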
153167
154168

155-
## <a id="failback"></a> Failback
169+
<a id="failback"></a>
170+
171+
## Failback
156172

157173
When failover groups are configured with a Microsoft-managed failover policy, forced failover to the geo-secondary server is initiated during a disaster scenario according to the defined grace period. Failback to the old primary must be initiated manually.
158174

159175
## Permissions and limitations
160176

161177
Review the configure failover group guide for a list of [permissions](failover-group-configure-sql-db.md#permissions) and [limitations](failover-group-configure-sql-db.md#limitations).
162178

179+
<a id="programmatically-managing-failover-groups"></a>
180+
181+
## Programmatically manage failover groups
182+
183+
Failover groups can also be managed programmatically using Azure PowerShell, Azure CLI, and REST API. For more information, review [Configure a failover group for Azure SQL Database](failover-group-configure-sql-db.md).
184+
185+
## Enable high availability (zone redundancy)
186+
187+
[Availability through redundancy](high-availability-sla-local-zone-redundancy.md) improves resiliency further by protecting against outages of an availability zone within a region.
188+
189+
When creating a failover group that includes one or more databases, there is no option to enable high availability for the secondary databases, regardless of the high availability settings of the primary databases.
190+
191+
### Zone redundancy with non-Hyperscale databases
192+
193+
Secondary databases created through the failover group **will not** have high availability enabled by default. After the failover group is created, enable high availability on the databases contained within the group. This behavior also applies if you first configure active geo-replication and then optionally add the databases to a failover group.
194+
195+
### Zone redundancy with Hyperscale
196+
197+
Secondary databases created through the failover group **will** inherit the high availability settings of their respective primary databases. Therefore, if the primary database has high availability enabled, the secondary database will also have it enabled. Conversely, if the primary database does not have high availability enabled, the secondary database will not have it enabled either.
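
The default behavior described in the two subsections above can be summarized as a small decision function. This is only a sketch of the rules as stated in this article; the function and parameter names are hypothetical, and it does not model regional availability zone support.

```python
def secondary_zone_redundant_by_default(is_hyperscale: bool,
                                        primary_zone_redundant: bool) -> bool:
    """Return whether a secondary created through a failover group
    starts with high availability (zone redundancy) enabled.

    Hyperscale: the secondary inherits the primary's setting.
    Non-Hyperscale: the secondary is created without it, and it must be
    enabled after the failover group is created.
    """
    if is_hyperscale:
        return primary_zone_redundant
    return False


if __name__ == "__main__":
    for hyperscale in (False, True):
        for primary_zr in (False, True):
            result = secondary_zone_redundant_by_default(hyperscale, primary_zr)
            print(f"hyperscale={hyperscale} primary_zr={primary_zr} -> {result}")
```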
163198

164-
## <a id="programmatically-managing-failover-groups"></a> Programmatically manage failover groups
199+
### Regional support for availability zones
165200

166-
Failover groups can also be managed programmatically using Azure PowerShell, Azure CLI, and REST API. For more information, review [Configure failover group](failover-group-configure-sql-db.md).
201+
If high availability is enabled on the primary database and the secondary database being added is in a region that does not yet support availability zones, the workflow fails with error code 45122: "Create or update Failover Group operation successfully completed; however, some of the databases could not be added to or removed from Failover Group. Provisioning of zone redundant database/pool is not supported for your current request." To work around this issue, use [Active geo-replication](active-geo-replication-overview.md), which lets you enable or disable high availability while creating the secondary database. You can then optionally add these databases to a failover group.
167202

168203
## Related content
169204

170205
- For sample scripts, see:
171-
- [Use PowerShell to configure active geo-replication for Azure SQL Database](scripts/setup-geodr-and-failover-database-powershell.md)
172-
- [Use PowerShell to configure active geo-replication for a pooled database in Azure SQL Database](scripts/setup-geodr-and-failover-elastic-pool-powershell.md)
173-
- [Use PowerShell to add an Azure SQL Database to a failover group](scripts/add-database-to-failover-group-powershell.md)
206+
- [Use PowerShell to configure active geo-replication for Azure SQL Database](scripts/setup-geodr-and-failover-database-powershell.md)
207+
- [Use PowerShell to configure active geo-replication for a pooled database in Azure SQL Database](scripts/setup-geodr-and-failover-elastic-pool-powershell.md)
208+
- [Use PowerShell to add an Azure SQL Database to a failover group](scripts/add-database-to-failover-group-powershell.md)
174209
- For a business continuity overview and scenarios, see [Business continuity overview](business-continuity-high-availability-disaster-recover-hadr-overview.md)
175210
- To learn about Azure SQL Database automated backups, see [SQL Database automated backups](automated-backups-overview.md).
176211
- To learn about using automated backups for recovery, see [Restore a database from the service-initiated backups](recovery-using-backups.md).

azure-sql/database/media/failover-group-overview/failover-group.png renamed to azure-sql/database/media/failover-group-sql-db/failover-group.png

File renamed without changes.
