azure-sql/database/failover-group-sql-db.md
Lines changed: 57 additions & 22 deletions
@@ -1,10 +1,10 @@
1
1
---
2
2
title: Failover groups overview & best practices
3
3
description: Failover groups let you manage geo-replication and automatic / coordinated failover of a group of databases on a server for both single and pooled databases in Azure SQL Database.
4
-
author: rajeshsetlem
5
-
ms.author: rsetlem
6
-
ms.reviewer: wiassaf, mathoma
7
-
ms.date: 09/30/2024
4
+
author: WilliamDAssafMSFT
5
+
ms.author: wiassaf
6
+
ms.reviewer: rsetlem, mathoma
7
+
ms.date: 01/13/2025
8
8
ms.service: azure-sql-database
9
9
ms.subservice: high-availability
10
10
ms.topic: conceptual
@@ -21,26 +21,28 @@ ms.custom:
21
21
22
22
The failover groups feature allows you to manage the replication and failover of some or all databases on a [logical server](logical-servers.md) to a logical server in another region. This article provides an overview of the failover group feature with best practices and recommendations for using it with Azure SQL Database.
23
23
24
-
To get started using the feature, review [Configure failover group](failover-group-configure-sql-db.md).
24
+
To get started using the feature, review [Configure a failover group for Azure SQL Database](failover-group-configure-sql-db.md).
25
25
26
26
> [!NOTE]
27
-
> This article covers failover groups for Azure SQL Database. For Azure SQL Managed Instance, see [Failover groups in Azure SQL Managed Instance](../managed-instance/failover-group-sql-mi.md).
27
+
> This article covers failover groups for Azure SQL Database. For Azure SQL Managed Instance, see [Failover groups overview & best practices - Azure SQL Managed Instance](../managed-instance/failover-group-sql-mi.md).
28
28
29
29
To learn more about Azure SQL Database disaster recovery, watch this video:
The failover groups feature allows you to manage the replication and failover of databases to another Azure region. You can choose all, or a subset of, user databases in a logical server to be replicated to another logical server. It's a declarative abstraction on top of the [active geo-replication](../database/active-geo-replication-overview.md) feature, designed to simplify deployment and management of geo-replicated databases at scale.
35
+
The failover groups feature allows you to manage the replication and failover of databases to another Azure region. You can choose all, or a subset of, user databases in a logical server to be replicated to another logical server. It's a declarative abstraction on top of the [active geo-replication](active-geo-replication-overview.md) feature, designed to simplify deployment and management of geo-replicated databases at scale.
36
36
37
37
For geo-failover RPO and RTO, see [overview of business continuity](business-continuity-high-availability-disaster-recover-hadr-overview.md#rto-and-rpo).
## <a id="terminology-and-capabilities"></a> Terminology and capabilities
43
+
<a id="terminology-and-capabilities"></a>
44
+
45
+
## Terminology and capabilities
44
46
45
47
<!--
46
48
There is some overlap of content in the following articles, be sure to make changes to all if necessary:
@@ -101,11 +103,13 @@ There is some overlap of content in the following articles, be sure to make chan
101
103
102
104
A failover group in Azure SQL Database can include one or multiple databases, typically used by the same application. A failover group must be configured on the primary server, which connects it to the secondary server in a different Azure region. The failover group can include all or some databases in the primary server. The following diagram illustrates a typical configuration of a geo-redundant cloud application using multiple databases in a failover group:
103
105
104
-
:::image type="content" source="./media/failover-group-overview/failover-group.png" alt-text="Diagram shows a typical configuration of a geo-redundant cloud application using multiple databases and a failover group.":::
106
+
:::image type="content" source="media/failover-group-sql-db/failover-group.png" alt-text="Diagram shows a typical configuration of a geo-redundant cloud application using multiple databases and a failover group.":::
107
+
108
+
When designing a service with business continuity in mind, follow the general guidelines and best practices outlined in this article. When configuring a failover group, ensure that authentication and network access on the secondary is set up to function correctly after geo-failover, when the geo-secondary becomes the new primary. For details, see [Configure and manage Azure SQL Database security for geo-restore or failover](active-geo-replication-security-configure.md). For more information, see [Designing globally available services using Azure SQL Database](designing-cloud-solutions-for-disaster-recovery.md).
105
109
106
-
When designing a service with business continuity in mind, follow the general guidelines and best practices outlined in this article. When configuring a failover group, ensure that authentication and network access on the secondary is set up to function correctly after geo-failover, when the geo-secondary becomes the new primary. For details, see [SQL Database security after disaster recovery](active-geo-replication-security-configure.md). For more information, see [Designing cloud solutions for disaster recovery](designing-cloud-solutions-for-disaster-recovery.md).
110
+
<a id="using-geo-paired-regions"></a>
107
111
108
-
## <a id="using-geo-paired-regions"></a> Use paired regions
112
+
## Use paired regions
109
113
110
114
When creating your failover group between the primary and secondary server, use [paired regions](/azure/reliability/cross-region-replication-azure) as failover groups in paired regions have better performance compared to unpaired regions.
111
115
@@ -124,53 +128,84 @@ The number of databases within a failover group directly impacts the duration of
124
128
- During a Failover (also known as Planned Failover), we ensure that all primary databases are fully synchronized with their secondary and reach a ready state. To avoid overwhelming the control plane, databases are prepared in batches. Therefore, it is highly recommended to limit the number of databases in a failover group.
125
129
- In the case of a Forced Failover, the preparation phase is expedited as data synchronization is not initiated. To achieve quicker and predictable failover durations, it might be beneficial to keep the number of databases in the failover group to a smaller number.
126
130
127
-
## <a id="using-one-or-several-failover-groups-to-manage-failover-of-multiple-databases"></a> Use multiple failover groups to fail over multiple databases
## Use multiple failover groups to fail over multiple databases
128
134
129
135
One or many failover groups can be created between two servers in different regions (primary and secondary servers). Each group can include one or several databases that are recovered as a unit in case all or some primary databases become unavailable due to an outage in the primary region. Creating a failover group creates geo-secondary databases with the same service objective as the primary. If you add an existing geo-replication relationship to a failover group, make sure the geo-secondary is configured with the same service tier and compute size as the primary.
130
136
131
-
## <a id="using-read-write-listener-for-oltp-workload"></a> Use the read-write listener (primary)
For read-write workloads, use `<fog-name>.database.windows.net` as the server name in the connection string. Connections are automatically directed to the primary. This name doesn't change after failover. Note the failover involves updating the DNS record so the client connections are redirected to the new primary only after the client DNS cache is refreshed. The time to live (TTL) of the primary and secondary listener DNS record is 30 seconds.
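The reconnect behavior implied above can be sketched as follows. This is a minimal illustration, not a prescribed client pattern: only the listener name format and the 30-second TTL come from the text. The `connect` callable, the helper names, and the retry defaults are hypothetical, and the sketch assumes the driver call raises `ConnectionError` while the client still resolves the old primary.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# TTL of the listener DNS record, as stated in the article.
LISTENER_DNS_TTL_SECONDS = 30


def read_write_listener(fog_name: str) -> str:
    """Return the read-write listener host name for a failover group."""
    return f"{fog_name}.database.windows.net"


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 4,
    delay_seconds: float = LISTENER_DNS_TTL_SECONDS / 2,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds or `attempts` is exhausted.

    `connect` is a hypothetical zero-argument callable supplied by the
    caller. With the assumed defaults, the waits between attempts add up
    to 45 seconds, which is longer than the 30-second DNS TTL.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay_seconds)  # give the client DNS cache time to refresh
    raise AssertionError("unreachable")
```

Because the listener name doesn't change after failover, the retry loop only needs to wait; it never has to rebuild the connection string.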
134
142
135
-
## <a id="using-read-only-listener-for-read-only-workload"></a> Use the read-only listener (secondary)
If you have logically isolated read-only workloads that are tolerant to data latency, you can run them on the geo-secondary. For read-only sessions, use `<fog-name>.secondary.database.windows.net` as the server name in the connection string. Connections are automatically directed to the geo-secondary. It's also recommended that you indicate read intent in the connection string by using `ApplicationIntent=ReadOnly`.
138
148
139
149
In the Premium, Business Critical, and Hyperscale service tiers, SQL Database supports the use of [read-only replicas](read-scale-out.md) to offload read-only query workloads, using the `ApplicationIntent=ReadOnly` parameter in the connection string. When you have configured a geo-secondary, you can use this capability to connect to either a read-only replica in the primary location or in the geo-secondary location:
140
150
141
151
To connect to a read-only replica in the secondary location, use `ApplicationIntent=ReadOnly` and `<fog-name>.secondary.database.windows.net`.
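The listener choices described in this section can be summarized in a short sketch. The listener host names and the `ApplicationIntent=ReadOnly` keyword come from the text. Everything else is an assumption for illustration: the `Target` enum, the function names, the ODBC-style keyword layout, and the driver name. Using the primary listener with read intent for a replica in the primary location is inferred from the paragraph above rather than stated as a separate step. Authentication settings are deliberately left out.

```python
from enum import Enum


class Target(Enum):
    PRIMARY = "primary"                    # read-write workload
    PRIMARY_READ_REPLICA = "primary-read"  # read-only replica, primary location
    GEO_SECONDARY = "geo-secondary"        # read-only workload on the geo-secondary


def listener_host(fog_name: str, target: Target) -> str:
    """Pick the failover group listener for the requested target."""
    if target is Target.GEO_SECONDARY:
        return f"{fog_name}.secondary.database.windows.net"
    return f"{fog_name}.database.windows.net"


def connection_string(fog_name: str, database: str, target: Target) -> str:
    """Build an ODBC-style connection string (hypothetical layout)."""
    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",  # assumed driver name
        f"Server=tcp:{listener_host(fog_name, target)},1433",
        f"Database={database}",
        "Encrypt=yes",
    ]
    if target is not Target.PRIMARY:
        # Indicate read intent for every read-only target.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"
```

For example, `connection_string("myfog", "sales", Target.GEO_SECONDARY)` would point at `myfog.secondary.database.windows.net` and carry read intent, where `myfog` and `sales` are placeholder names.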
142
152
143
-
## <a id="preparing-for-performance-degradation"></a> Potential performance degradation after failover
153
+
<a id="preparing-for-performance-degradation"></a>
154
+
155
+
## Potential performance degradation after failover
144
156
145
157
A typical Azure application uses multiple Azure services and consists of multiple components. Failover of a group is triggered based on the state of Azure SQL Database alone. Other Azure services in the primary region might not be affected by the outage and their components might still be available in that region. Once the primary databases switch to the secondary (DR) region, latency between dependent components can increase. To avoid the impact of higher latency on the application's performance, ensure the redundancy of all the application's components in the DR region, follow these [network security guidelines](failover-group-configure-sql-db.md#failover-groups-and-network-security), and orchestrate the geo-failover of relevant application components together with the database.
146
158
147
-
## <a id="preparing-for-data-loss"></a> Potential data loss after forced failover
159
+
<a id="preparing-for-data-loss"></a>
160
+
161
+
## Potential data loss after forced failover
148
162
149
163
If an outage occurs in the primary region, recent transactions might not have been replicated to the geo-secondary and there might be data loss if a forced failover is performed.
150
164
151
165
> [!IMPORTANT]
152
166
> Elastic pools with 800 or fewer DTUs or 8 or fewer vCores, and more than 250 databases can encounter issues including longer planned geo-failovers and degraded performance. These issues are more likely to occur for write intensive workloads when geo-replicas are widely separated by geography, or when multiple secondary geo-replicas are used for each database. A symptom of these issues is an increase in geo-replication lag over time, potentially leading to a more extensive data loss in an outage. This lag can be monitored using [sys.dm_geo_replication_link_status](/sql/relational-databases/system-dynamic-management-views/sys-dm-geo-replication-link-status-azure-sql-database). If these issues occur, then mitigation includes scaling up the pool to have more DTUs or vCores, or reducing the number of geo-replicated databases in the pool.
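As a rough sketch of the check described in the note above, the following helpers test whether a pool matches the stated thresholds and flag links whose lag exceeds a limit. The thresholds (800 DTUs, 8 vCores, 250 databases) and the view name come from the text. The function names, the `max_lag_sec` limit, and the row shape (one mapping per row, however your driver returns it) are hypothetical, and the column names in the query should be verified against the view's documentation. Treating a missing lag value as a warning is also an assumption.

```python
from typing import Iterable, List, Mapping, Optional

# Query to run on the primary database; column names are assumed here.
LAG_QUERY = """
SELECT partner_server, partner_database,
       replication_state_desc, replication_lag_sec
FROM sys.dm_geo_replication_link_status;
"""


def pool_matches_warning(
    database_count: int,
    dtus: Optional[int] = None,
    vcores: Optional[int] = None,
) -> bool:
    """True if the pool fits the profile described in the note."""
    small_pool = (dtus is not None and dtus <= 800) or (
        vcores is not None and vcores <= 8
    )
    return small_pool and database_count > 250


def lagging_links(rows: Iterable[Mapping], max_lag_sec: int) -> List[str]:
    """Return 'server/database' labels for links that need attention.

    `max_lag_sec` is a caller-chosen limit, not a documented value.
    """
    flagged = []
    for row in rows:
        lag = row.get("replication_lag_sec")
        if lag is None or lag > max_lag_sec:  # unknown lag is flagged too
            flagged.append(f"{row['partner_server']}/{row['partner_database']}")
    return flagged
```

Running the query on a schedule and comparing results over time would show whether lag is growing, which is the symptom the note describes.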
153
167
154
168
155
-
## <a id="failback"></a> Failback
169
+
<a id="failback"></a>
170
+
171
+
## Failback
156
172
157
173
When failover groups are configured with a Microsoft-managed failover policy, then forced failover to the geo-secondary server is initiated during a disaster scenario as per the defined grace period. Failback to the old primary must be initiated manually.
158
174
159
175
## Permissions and limitations
160
176
161
177
Review the configure failover group guide for a list of [permissions](failover-group-configure-sql-db.md#permissions) and [limitations](failover-group-configure-sql-db.md#limitations).
Failover groups can also be managed programmatically using Azure PowerShell, Azure CLI, and REST API. For more information, review [Configure a failover group for Azure SQL Database](failover-group-configure-sql-db.md).
184
+
185
+
## Enable high availability (zone redundancy)
186
+
187
+
[Availability through redundancy](high-availability-sla-local-zone-redundancy.md) improves resiliency further by protecting against outages of an availability zone within a region.
188
+
189
+
When creating a failover group that includes one or more databases, there is no option to enable high availability for the secondary databases, regardless of the high availability settings of the primary databases.
190
+
191
+
### Zone redundancy with non-Hyperscale databases
192
+
193
+
Secondary databases created through the failover group **will not** have high availability enabled by default. After the failover group is created, enable high availability on the databases contained within the group. This behavior also applies if you create Active Geo-Replication first and then optionally add the databases to a failover group.
194
+
195
+
### Zone redundancy with Hyperscale
196
+
197
+
Secondary databases created through the failover group **will** inherit the high availability settings of their respective primary databases. Therefore, if the primary database has high availability enabled, the secondary database will also have it enabled. Conversely, if the primary database does not have high availability enabled, the secondary database will not have it enabled either.
163
198
164
-
## <a id="programmatically-managing-failover-groups"></a> Programmatically manage failover groups
199
+
### Regional support for availability zones
165
200
166
-
Failover groups can also be managed programmatically using Azure PowerShell, Azure CLI, and REST API. For more information, review [Configure failover group](failover-group-configure-sql-db.md).
201
+
In a scenario where high availability is enabled on the primary database, and the secondary database being added is in a region that does not yet support availability zones, the workflow will fail with an error message with code 45122: "Create or update Failover Group operation successfully completed; however, some of the databases could not be added to or removed from Failover Group. Provisioning of zone redundant database/pool is not supported for your current request." To work around this issue, use [Active geo-replication](active-geo-replication-overview.md) where you enable or disable high availability while creating the secondary database. You can then optionally add these databases to a failover group.
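The zone redundancy rules in the three subsections above can be condensed into a small decision sketch. It only restates the documented behavior as code and doesn't call any Azure API. The `Outcome` type and `predict_secondary` function are hypothetical names. The text doesn't say whether the regional failure applies to every service tier, so the sketch applies that condition exactly as written, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Error code quoted in the article for unsupported regions.
ZONE_ERROR_CODE = 45122


@dataclass(frozen=True)
class Outcome:
    succeeds: bool
    secondary_zone_redundant: bool
    error_code: Optional[int] = None


def predict_secondary(
    is_hyperscale: bool,
    primary_zone_redundant: bool,
    secondary_region_has_zones: bool,
) -> Outcome:
    """Predict the secondary's state right after it joins a failover group."""
    if primary_zone_redundant and not secondary_region_has_zones:
        # Applied regardless of tier, as the text states it (assumption).
        return Outcome(False, False, ZONE_ERROR_CODE)
    if is_hyperscale:
        # Hyperscale secondaries inherit the primary's setting.
        return Outcome(True, primary_zone_redundant)
    # Non-Hyperscale secondaries start without high availability.
    return Outcome(True, False)
```

A non-Hyperscale result of `secondary_zone_redundant=False` is the case where high availability has to be enabled on the databases after the failover group is created.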
167
202
168
203
## Related content
169
204
170
205
- For sample scripts, see:
171
-
  - [Use PowerShell to configure active geo-replication for Azure SQL Database](scripts/setup-geodr-and-failover-database-powershell.md)
172
-
  - [Use PowerShell to configure active geo-replication for a pooled database in Azure SQL Database](scripts/setup-geodr-and-failover-elastic-pool-powershell.md)
173
-
  - [Use PowerShell to add an Azure SQL Database to a failover group](scripts/add-database-to-failover-group-powershell.md)
206
+
  - [Use PowerShell to configure active geo-replication for Azure SQL Database](scripts/setup-geodr-and-failover-database-powershell.md)
207
+
  - [Use PowerShell to configure active geo-replication for a pooled database in Azure SQL Database](scripts/setup-geodr-and-failover-elastic-pool-powershell.md)
208
+
  - [Use PowerShell to add an Azure SQL Database to a failover group](scripts/add-database-to-failover-group-powershell.md)
174
209
- For a business continuity overview and scenarios, see [Business continuity overview](business-continuity-high-availability-disaster-recover-hadr-overview.md).
175
210
- To learn about Azure SQL Database automated backups, see [SQL Database automated backups](automated-backups-overview.md).
176
211
- To learn about using automated backups for recovery, see [Restore a database from the service-initiated backups](recovery-using-backups.md).