| title | Troubleshoot Always On Availability Groups Configuration (SQL Server) | |||
|---|---|---|---|---|
| description | Troubleshoot typical problems with configuring server instances for Always On availability groups in SQL Server. | |||
| ms.custom | seo-lt-2019 | |||
| ms.date | 05/17/2016 | |||
| ms.prod | sql | |||
| ms.reviewer | ||||
| ms.technology | availability-groups | |||
| ms.topic | troubleshooting | |||
| helpviewer_keywords |
|
|||
| ms.assetid | 8c222f98-7392-4faf-b7ad-5fb60ffa237e | |||
| author | MashaMSFT | |||
| ms.author | mathoma |
[!INCLUDE SQL Server]
This topic provides information to help you troubleshoot typical problems with configuring server instances for [!INCLUDEssHADR]. Typical configuration problems include [!INCLUDEssHADR] is disabled, accounts are incorrectly configured, the database mirroring endpoint does not exist, the endpoint is inaccessible (SQL Server Error 1418), network access does not exist, and a join database command fails (SQL Server Error 35250).
Note
Ensure that you are meeting the [!INCLUDEssHADR] prerequisites. For more information, see Prerequisites, Restrictions, and Recommendations for Always On Availability Groups (SQL Server).
In This Topic:
| Section | Description |
|---|---|
| Always On Availability Groups Is Not Enabled | If an instance of [!INCLUDEssNoVersion] is not enabled for [!INCLUDEssHADR], the instance does not support availability group creation and cannot host any availability replicas. |
| Accounts | Discusses requirements for correctly configuring the accounts under which [!INCLUDEssNoVersion] is running. |
| Endpoints | Discusses how to diagnose issues with the database mirroring endpoint of a server instance. |
| Network access | Documents the requirement that each server instance that is hosting an availability replica must be able to access the port of each of the other server instances over TCP. |
| Endpoint Access (SQL Server Error 1418) | Contains information about this [!INCLUDEssNoVersion] error message. |
| Join Database Fails (SQL Server Error 35250) | Discusses the possible causes and resolution of a failure to join secondary databases to an availability group because the connection to the primary replica is not active. |
| Read-Only Routing is Not Working Correctly | |
| Related Tasks | Contains a list of task-oriented topics in [!INCLUDEssnoversion] Books Online that are particularly relevant to troubleshooting an availability group configuration. |
| Related Content | Contains a list of relevant resources that are external to [!INCLUDEssNoVersion] Books Online. |
The [!INCLUDEssHADR] feature must be enabled on each of the instances of [!INCLUDEssnoversion]. For more information, see Enable and Disable Always On Availability Groups (SQL Server).
The accounts under which [!INCLUDEssNoVersion] is running must be correctly configured.
-
Do the accounts have the correct permissions?
-
If the partners run under the same domain account, the correct user logins exist automatically in both master databases. This simplifies the security configuration and is recommended.
-
If two server instances run under different accounts, the each account must be created in master on the remote server instance, and that login must be granted CONNECT permissions to connect to the database mirroring endpoint of that server instance. For more information, seeSet Up Login Accounts for Database Mirroring or Always On Availability Groups (SQL Server). You can use the following query on each instance to check if the logins have CONNECT permissions:
SELECT perm.class_desc, prin.name, perm.permission_name, perm.state_desc, prin.type_desc as PrincipalType, prin.is_disabled FROM sys.server_permissions perm LEFT JOIN sys.server_principals prin ON perm.grantee_principal_id = prin.principal_id LEFT JOIN sys.tcp_endpoints tep ON perm.major_id = tep.endpoint_id WHERE perm.class_desc = 'ENDPOINT' AND perm.permission_name = 'CONNECT' AND tep.type = 4
-
-
If [!INCLUDEssNoVersion] is running under a built-in account, such as Local System, Local Service, or Network Service, or a nondomain account, you must use certificates for endpoint authentication. If your service accounts are using domain accounts in the same domain, you can choose to grant CONNECT access for each service account on all the replica locations or you can use certificates. For more information, seeUse Certificates for a Database Mirroring Endpoint (Transact-SQL).
Endpoints must be correctly configured.
-
Make sure that each instance of [!INCLUDEssNoVersion] that is going to host an availability replica (each replica location) has a database mirroring endpoint. To determine whether a database mirroring endpoint exists on a given server instance, use the sys.database_mirroring_endpoints catalog view:
SELECT name, state_desc FROM sys.database_mirroring_endpoints
For more information on creating endpoints, see either Create a Database Mirroring Endpoint for Windows Authentication (Transact-SQL) or Allow a Database Mirroring Endpoint to Use Certificates for Outbound Connections (Transact-SQL).
-
Check that the port numbers are correct.
To identify the port currently associated with database mirroring endpoint of a server instance, use the following [!INCLUDEtsql] statement:
SELECT type_desc, port FROM sys.tcp_endpoints; GO
-
For [!INCLUDEssHADR] setup issues that are difficult to explain, we recommend that you inspect each server instance to determine whether it is listening on the correct ports.
-
Make sure that the endpoints are started (STATE=STARTED). On each server instance, use the following [!INCLUDEtsql] statement:
SELECT state_desc FROM sys.database_mirroring_endpointsFor more information about the state_desc column, see sys.database_mirroring_endpoints (Transact-SQL).
To start an endpoint, use the following [!INCLUDEtsql] statement:
ALTER ENDPOINT Endpoint_Mirroring STATE = STARTED AS TCP (LISTENER_PORT = <port_number>) FOR database_mirroring (ROLE = ALL); GOFor more information, see ALTER ENDPOINT (Transact-SQL).
-
Make sure that the login from the other server has CONNECT permission. To determine who has CONNECT permission for an endpoint, on each server instance use the following [!INCLUDEtsql] statement:
SELECT 'Metadata Check'; SELECT EP.name, SP.STATE, CONVERT(nvarchar(38), suser_name(SP.grantor_principal_id)) AS GRANTOR, SP.TYPE AS PERMISSION, CONVERT(nvarchar(46),suser_name(SP.grantee_principal_id)) AS GRANTEE FROM sys.server_permissions SP , sys.endpoints EP WHERE SP.major_id = EP.endpoint_id ORDER BY Permission,grantor, grantee;
-
Ensure correct server name is used in the endpoint URL
For server name in an endpoint URL, it is recommended to use fully qualified domain name (FQDN), although you can use any name that uniquely identifies the machine. The server address can be a Netbios name (if the systems are in the same domain), a fully qualified domain name (FQDN), or an IP address (preferably, a static IP address). Using the fully qualified domain name is the recommended option.
If you have already defined an Endpoint URL, you can query it by using:
select endpoint_url from sys.availability_replicas
Next, compare the endpoint_url output to the server name (Netbios or FQDN). To query the Netbios and FQDN, run the following in a Command Prompt on the replica locally:
hostname & echo %COMPUTERNAME%.%USERDNSDOMAIN%For query the server name of a remote computer, run this from a Command Prompt. Then compare the endpoint_url
ping -a servername_from_endpoint_urlFor more information, see Specify the Endpoint URL When Adding or Modifying an Availability Replica (SQL Server).
Each server instance that is hosting an availability replica must be able to access the port of each of the other server instance over TCP. This is especially important if the server instances are in different domains that do not trust each other (untrusted domains). Check if you can connect to the endpoints by following these steps:
-
Use Telnet to validate connectivity. Here are examples of commands you can use:
telnet ServerName Port telnet IP_Address Port -
If the Endpoint is listening and connection is successful, then you will see a blank screen. If not, you will receive a connection error from Telnet
-
If Telnet connection to the IP address works but to the ServerName it does not, there is likely a DNS or name resolution issue
-
If connection works by ServerName and not by IP address, then there could be more than one endpoint defined on that server (another SQL instance perhaps) that is listening on that port. Though the status of the endpoint on the instance in question shows "STARTED" another instance may actually have the port binding and prevent the correct instance from listening and establishing TCP connections.
-
If Telnet fails to connect, look for Firewall and/or Anti-virus software that may be blocking the endpoint port in question. Check the firewall setting to see if it allows the endpoint port communication between the server instances that host primary replica and the secondary replica (port 5022 by default). Run the following PowerShell script to examine for disabled inbound traffic rules
-
If Telnet fails to connect, look for Firewall and/or antivirus software that may be blocking the endpoint port in question. If you are running [!INCLUDEssNoVersion] on Azure VM, additionally you would need to ensure Network Security Group (NSG) allows the traffic to endpoint port. Check the firewall (and NSG, for Azure VM) setting to see if it allows the endpoint port communication between the server instances that host primary replica and the secondary replica (port 5022 by default)
Get-NetFirewallRule -Action Block -Enabled True -Direction Inbound |Format-Table
-
Capture a NETSTAT -a output and verify the status is a LISTENING or ESTABLISHED on the IP:Port for the endpoint specified
netstat -a
This [!INCLUDEssNoVersion] message indicates that the server network address specified in the endpoint URL cannot be reached or does not exist, and it suggests that you verify the network address name and reissue the command.
This section discusses the possible causes and resolution of a failure to join secondary databases to the availability group because the connection to the primary replica is not active. This is the full error message:
Msg 35250 The connection to the primary replica is not active. The command cannot be processed.
Resolution:
Summary of steps is outlined below.
For detailed step-by-step instructions, please refer to Engine error MSSQLSERVER_35250
- Ensure the endpoint is created and started.
- Check if you can connect to the endpoint via Telnet and ensure no firewall rules are blocking connectivity
- Check for errors in the system. You can query the sys.dm_hadr_availability_replica_states for the last_connect_error_number that may help you diagnose the join issue.
- Ensure the endpoint is defined so it correctly matches the IP/port that AG is using.
- Check whether the network service account has CONNECT permission to the endpoint.
- Check for possible name resolution issues
- Ensure your [!INCLUDEssNoVersion] is running a recent build (preferably the latest build to protect from running into fixed issues.
-
Ensure that you have set up read-only routing by following Configure read-only routing document.
-
Ensure Client Driver Support
The client application must use a client providers that support
ApplicationIntentparameter. See Driver and client connectivity support for availability groups[!NOTE]
If you are connecting to a distributed network name (DNN) Listener, the provider must also supportMultiSubnetFailoverparameter -
Ensure connection string properties are set correctly
For read-only routing to work properly, your client application must use these properties in the connection string:
- A database name that belongs to the AG
- An availability group listener name
- If you are using DNN, you must specify DNN listener name and DNN port number
<DNN name,DNN port>
- If you are using DNN, you must specify DNN listener name and DNN port number
- ApplicationIntent set to ReadOnly
- MultiSubnetFailover set to true is required for Distributed network name (DNN)
This illustrates the connection string for .NET System.Data.SqlClient provider for a virtual network name (VNN) listener:
Server=tcp:VNN_AgListener,1433;Database=AgDb1;ApplicationIntent=ReadOnly;MultiSubnetFailover=True
This illustrates the connection string for .NET System.Data.SqlClient provider for a distributed network name (DNN) listener:
Server=tcp:DNN_AgListener,DNN_Port;Database=AgDb1;ApplicationIntent=ReadOnly;MultiSubnetFailover=True
[!NOTE]
If you are using command line programs like SQLCMD, ensure that you specify the correct switches for server name. For instance, in SQLCMD you must use the upper case -S switch that specifies server name, not the lower case -s switch which is used for column separator.
Example:sqlcmd -S AG_Listener,port -E -d AgDb1 -K ReadOnly -M -
Ensure that the availability group listener is online. To ensure that the availability group listener is online run the following query on the primary replica:
SELECT * FROM sys.dm_tcp_listener_states;
If you find the listener is offline you can attempt to bring it online using a command like this:
ALTER AVAILABILITY GROUP myAG RESTART LISTENER 'AG_Listener'; -
Ensure READ_ONLY_ROUTING_LIST is correctly populated. On Primary replica, ensure that the READ_ONLY_ROUTING_LIST contains only server instances that are hosting readable secondary replicas.
To view the properties of each replica you can run this query and examine the connectivity endpoint (URL) of the read only replica.
SELECT replica_id, replica_server_name, secondary_role_allow_connections_desc, read_only_routing_url FROM sys.availability_replicas;
To view a read-only routing list and compare to the endpoint URL:
SELECT * FROM sys.availability_read_only_routing_lists;
To change a read-only routing list you can use a query like this:
ALTER AVAILABILITY GROUP [AG1] MODIFY REPLICA ON N'COMPUTER02' WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=('COMPUTER01','COMPUTER02')));
For more information see Configure read-only routing for an availability group - SQL Server Always On
-
Check that READ_ONLY_ROUTING_URL port is open. Ensure that the Windows firewall is not blocking the READ_ONLY_ROUTING_URL port. Configure a Windows Firewall for database engine access on every replica in the read_only_routing_list and any for clients that will be connecting to those replicas.
[!NOTE] If you are running [!INCLUDEssNoVersion] on Azure VM, you must take additional configuration steps. Ensure that the network security group (NSG) of each replica VM allows traffic to the endpoint port and the DNN port, if you are using DNN listener. If you are using VNN listener, you must ensure the load balancer is configured correctly.
-
Ensure that the READ_ONLY_ROUTING_URL (TCP://system-address:port) contains the correct fully-qualified domain name (FQDN) and port number. See:
-
Ensure proper SQL Server Networking configuration in the [!INCLUDEssNoVersion] Configuration Manager.
Verify on every replica in the read_only_routing_list that:
- [!INCLUDEssNoVersion] remote connectivity is enabled
- TCP/IP is enabled
- The IP addresses are configured correctly
[!NOTE] You can quickly verify all of these are properly configured if you can connect from a remote machine to a target secondary replica's [!INCLUDEssNoVersion] instance name using
TCP:SQL_Instancesyntax.
See: Configure a Server to Listen on a Specific TCP Port (SQL Server Configuration Manager) and View or Change Server Properties (SQL Server)
-
Creation and Configuration of Availability Groups (SQL Server)
-
Create a Database Mirroring Endpoint for Windows Authentication (Transact-SQL)
-
Specify the Endpoint URL When Adding or Modifying an Availability Replica (SQL Server)
-
Manually Prepare a Secondary Database for an Availability Group (SQL Server)
-
Troubleshoot a Failed Add-File Operation (Always On Availability Groups)
-
Management of Logins and Jobs for the Databases of an Availability Group (SQL Server)
-
Manage Metadata When Making a Database Available on Another Server Instance (SQL Server)
Transport Security for Database Mirroring and Always On Availability Groups (SQL Server)
Client Network Configuration
Prerequisites, Restrictions, and Recommendations for Always On Availability Groups (SQL Server)