Commit a1471bc

Merge pull request #25573 from PiJoCoder/Error194XX_JoPilov_011322
Create doc files for 19421 and 19419
2 parents 1436ad1 + 90ab238 commit a1471bc

5 files changed

Lines changed: 132 additions & 4 deletions

docs/relational-databases/errors-events/includes/database-engine-events-and-errors-19000-20999.md

Lines changed: 3 additions & 0 deletions
@@ -90,6 +90,9 @@ ms.topic: include
 | 19148 | 10 | No | Unable to initialize SSL support. |
 | 19149 | 10 | No | Unable to configure MDAC-compatibility protocol list in registry. |
 | 19150 | 10 | No | Unable to open SQL Server Network Interface library configuration key in registry. |
+| [19407](../mssqlserver-19407-database-engine-error.md) | 10 | No | The lease between availability group '%.*ls' and the Windows Server Failover Cluster has expired. A connectivity issue occurred between the instance of SQL Server and the Windows Server Failover Cluster. To determine whether the availability group is failing over correctly, check the corresponding availability group resource in the Windows Server Failover Cluster. |
+| [19419](../mssqlserver-19419-database-engine-error.md) | 10 | No | Windows Server Failover Cluster did not receive a process event signal from SQL Server hosting availability group '%.*ls' within the lease timeout period. |
+| [19421](../mssqlserver-19421-database-engine-error.md) | 10 | No | SQL Server hosting availability group '%.*ls' did not receive a process event signal from the Windows Server Failover Cluster within the lease timeout period. |
 | 20001 | 10 | No | There is no nickname for article '%s' in publication '%s'. |
 | 20002 | 10 | No | The filter '%s' already exists for article '%s' in publication '%s'. |
 | 20003 | 10 | No | Could not generate nickname for '%s'. |

docs/relational-databases/errors-events/mssqlserver-19407-database-engine-error.md

Lines changed: 4 additions & 4 deletions
@@ -30,7 +30,7 @@ ms.author: jopilov

 Error 19407 is raised in the SQL Server error log when the communication between SQL Server and the Windows Server Failover cluster is lost. Typically a corrective action occurs - a failover to another Always On node.

-A lease is a time-based communication mechanism that takes place between the SQL Server and the Windows Server Failover Cluster (WSFC) process, specifically the RHS.EXE process. The two processes communicate with each other periodically to ensure the other process is running and responding. This communication takes place using Windows [event objects](/windows/win32/sync/event-objects) and ensures that a failover of the AG resource doesn't occur without the knowledge of the WSFC. If one of the processes doesn't respond to the lease communication based on a predefined lease period, a lease timeout occurs. For detailed information, see [Lease Mechanism](../../database-engine/availability-groups/windows/availability-group-lease-healthcheck-timeout.md). Also see [How It Works: SQL Server AlwaysOn Lease Timeout[(https://techcommunity.microsoft.com/t5/sql-server-support-blog/how-it-works-sql-server-alwayson-lease-timeout/ba-p/317268)
+A lease is a time-based communication mechanism that takes place between the SQL Server and the Windows Server Failover Cluster (WSFC) process, specifically the RHS.EXE process. The two processes communicate with each other periodically to ensure the other process is running and responding. This communication takes place using Windows [Event objects](/windows/win32/sync/event-objects) and ensures that a failover of the AG resource doesn't occur without the knowledge of the WSFC. If one of the processes doesn't respond to the lease communication based on a predefined lease period, a lease timeout occurs. For detailed information, see [Lease Mechanism](../../database-engine/availability-groups/windows/availability-group-lease-healthcheck-timeout.md). Also see [How It Works: SQL Server AlwaysOn Lease Timeout](https://techcommunity.microsoft.com/t5/sql-server-support-blog/how-it-works-sql-server-alwayson-lease-timeout/ba-p/317268)

 ### Causes

@@ -71,7 +71,7 @@ If there are occurrences of low virtual or physical memory on the system, the SQ
 - **Process\Working Set** - to check individual processes memory usage
 - **Memory\Available MBytes** - to check overall memory usage on the system

-Below is a PowerShell script to identify overall memory usage across all process and the available memory on the system. If you would like to get individual processes memory usage, change it `"\Process(_Total)\Working Set"` to `"\Process(*)\Working Set"`.
+You can use the following PowerShell script to identify overall memory usage across all processes and the available memory on the system. To get memory usage for individual processes, change `"\Process(_Total)\Working Set"` to `"\Process(*)\Working Set"`.

 ```powershell
 $serverName = $env:COMPUTERNAME
@@ -97,7 +97,7 @@ If there are occurrences of low virtual or physical memory on the system, the SQ

 ### Reduce or avoid large memory dumps of the SQL Server or cluster process

-In some cases SQL Server process may be encountering exceptions, asserts, scheduler issues and so on. In those cases, SQL Server will by default trigger the SQLDumper.exe process to generate a minidump with indirect memory. However, if that dump generation takes a long time, the SQL Server process will stop responding and this may trigger a lease timeout. Frequent causes for a memory dump taking a long time include large memory usage by the process, the I/O subsystem where the dump is written is slow, or the default setting was changed from mini dump to a filtered or full dump. To avoid a lease timeout, do the following on AG systems:
+In some cases, the SQL Server process may encounter exceptions, asserts, scheduler issues, and so on. In those cases, SQL Server will by default trigger the SQLDumper.exe process to generate a minidump with indirect memory. However, if that dump generation takes a long time, the SQL Server process will stop responding, which may trigger a lease timeout. Common causes for a memory dump to take a long time include large memory usage by the process, a slow I/O subsystem where the dump is written, or a change of the default setting from mini dump to a filtered or full dump. To avoid a lease timeout, use the following steps on AG systems:

 - Increase session-timeout, for example, 120 seconds for all replicas
 - Change the auto failover of all replicas to manual failover
@@ -107,7 +107,7 @@ For more information, see [Impact of dump generation](/troubleshoot/sql/tools/us

 ### Check virtual machine (VM) configuration for overprovisioning

-If you're using a virtual machine, ensure that you aren't overprovisioning or overcommitting CPUs and memory resources. Overprovisioning CPUs or memory may cause the guest OS to run out of resources and show the same problems described above - high CPU and low memory. Frequently if you're viewing things inside the guest OS, you'll have a hard time explaining why you're running out of computing resources because things are happening outside of the virtual machine itself. Overcommitting resources can cause temporary halts of processing, which are likely to cause lease timeouts. For more information on how to address overcommitting, see [Troubleshooting ESX/ESXi virtual machine performance issues (2001003)](https://kb.vmware.com/s/article/2001003) and [Virtualization – Overcommitting memory and how to detect it within the VM](https://techcommunity.microsoft.com/t5/running-sap-applications-on-the/virtualization-8211-overcommitting-memory-and-how-to-detect-it/ba-p/367623).
+If you're using a virtual machine, ensure that you aren't overprovisioning or overcommitting CPUs and memory resources. Overprovisioning CPUs or memory may cause the guest OS to run out of resources and show the same problems described earlier - high CPU and low memory. Frequently if you're viewing things inside the guest OS, you'll have a hard time explaining why you're running out of computing resources because things are happening outside of the virtual machine itself. Overcommitting resources can cause temporary halts of processing, which are likely to cause lease timeouts. For more information on how to address overcommitting, see [Troubleshooting ESX/ESXi virtual machine performance issues (2001003)](https://kb.vmware.com/s/article/2001003) and [Virtualization – Overcommitting memory and how to detect it within the VM](https://techcommunity.microsoft.com/t5/running-sap-applications-on-the/virtualization-8211-overcommitting-memory-and-how-to-detect-it/ba-p/367623).

 ### Check for virtual machine (VM) migration or backup

docs/relational-databases/errors-events/mssqlserver-19419-database-engine-error.md

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
---
title: "MSSQLSERVER_19419"
description: "MSSQLSERVER_19419"
author: pijocoder
ms.author: jopilov
ms.reviewer: pijocoder
ms.date: 01/13/2023
ms.service: sql
ms.subservice: supportability
ms.topic: "reference"
helpviewer_keywords:
- "19419 (Database Engine error)"
---
# MSSQLSERVER_19419

[!INCLUDE [SQL Server](../../includes/applies-to-version/sqlserver.md)]

## Details

| Attribute | Value |
| :--- | :--- |
| Product Name | SQL Server |
| Event ID | 19419 |
| Event Source | MSSQLSERVER |
| Component | SQLEngine |
| Symbolic Name | HADR_AG_LEASE_EXPIRED_WAITING_FOR_RENEW |
| Message Text | Windows Server Failover Cluster did not receive a process event signal from SQL Server hosting availability group '%.*ls' within the lease timeout period. |

## Explanation

Error 19419 is raised in the SQL Server error log when the lease worker on the SQL Server side isn't scheduled in time to process the event signal from the cluster.
Specifically, SQL Server calls [WaitForMultipleObjects()](/windows/win32/api/synchapi/nf-synchapi-waitformultipleobjects) to wait for the lease timeout event to be set to a signaled state. If the function returns `WAIT_OBJECT_0`, which indicates success, but the lease has already expired by that time, error 19419 is raised.

A lease is a time-based communication mechanism that takes place between the SQL Server and the Windows Server Failover Cluster (WSFC) process, specifically the RHS.EXE process. The two processes communicate with each other periodically to ensure the other process is running and responding. This communication takes place using Windows [Event objects](/windows/win32/sync/event-objects) and ensures that a failover of the AG resource doesn't occur without the knowledge of the WSFC. If one of the processes doesn't respond to the lease communication based on a predefined lease period, a lease timeout occurs. For detailed information, see [Lease Mechanism](../../database-engine/availability-groups/windows/availability-group-lease-healthcheck-timeout.md). Also see [How It Works: SQL Server AlwaysOn Lease Timeout](https://techcommunity.microsoft.com/t5/sql-server-support-blog/how-it-works-sql-server-alwayson-lease-timeout/ba-p/317268).

This error is related to other lease timeout errors and provides more specific detail for error [MSSQLSERVER_19407](mssqlserver-19407-database-engine-error.md).

### Causes

Because Windows events are lightweight synchronization objects, relatively few external factors affect them negatively. Typical issues that can lead to a lease timeout involve system-wide problems. The following conditions can cause lease expiration and lead to a restart or failover:

- High CPU usage on the system (close to 100%)
- Out-of-memory conditions - low virtual memory and/or one of the processes is being paged out
- SQL Server process not responding while generating a large memory dump
- WSFC going offline (for example, due to quorum loss)


The most common reason for error 19419 is high CPU, which causes a delay in scheduling the lease worker thread.

## User action

Check the CPU utilization on the server, because the SQL Server lease worker is likely starved for CPU resources. The following PowerShell script lets you quickly sample CPU usage on the system.

```powershell
Get-Counter -Counter "\Processor(_Total)\% Processor Time" -SampleInterval 5 -MaxSamples 30 |
    Select-Object -ExpandProperty CounterSamples | Select-Object TimeStamp, Path, CookedValue
```

For detailed troubleshooting, see the User action section in [MSSQLSERVER_19407](mssqlserver-19407-database-engine-error.md#user-action), which covers the following steps:

- Troubleshoot high CPU issues
- Troubleshoot low memory issues
- Reduce or avoid large memory dumps of the SQL Server or cluster process
- Check virtual machine (VM) configuration for overprovisioning
- Check for virtual machine (VM) migration or backup causing issues
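
Reviewer sketch (not part of the committed file): the `Get-Counter` command in the User action section collects 30 CPU samples at 5-second intervals. As a rough illustration of how such samples might be interpreted, the following Python snippet finds the longest run of consecutive high readings. The 90% threshold, the minimum run length of 3, the function names, and the sample readings are all assumptions chosen for illustration. They aren't values taken from SQL Server or WSFC documentation.

```python
def longest_high_cpu_run(samples, threshold=90.0):
    """Return the length of the longest run of consecutive samples >= threshold."""
    longest = current = 0
    for value in samples:
        if value >= threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # run broken by a sample below the threshold
    return longest


def is_sustained_high_cpu(samples, threshold=90.0, min_run=3):
    """True if at least min_run consecutive samples reach the threshold (assumed rule)."""
    return longest_high_cpu_run(samples, threshold) >= min_run


# Hypothetical CookedValue readings, one per 5-second sample
readings = [35.2, 97.8, 99.1, 98.4, 62.0, 95.5]
print(longest_high_cpu_run(readings))   # 3
print(is_sustained_high_cpu(readings))  # True
```

A single spike is less likely to starve the lease worker than a sustained plateau, which is why the sketch looks at consecutive samples instead of the average.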
docs/relational-databases/errors-events/mssqlserver-19421-database-engine-error.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
---
title: "MSSQLSERVER_19421"
description: "MSSQLSERVER_19421"
author: pijocoder
ms.author: jopilov
ms.reviewer: pijocoder
ms.date: 01/13/2023
ms.service: sql
ms.subservice: supportability
ms.topic: "reference"
helpviewer_keywords:
- "19421 (Database Engine error)"
---
# MSSQLSERVER_19421

[!INCLUDE [SQL Server](../../includes/applies-to-version/sqlserver.md)]

## Details

| Attribute | Value |
| :--- | :--- |
| Product Name | SQL Server |
| Event ID | 19421 |
| Event Source | MSSQLSERVER |
| Component | SQLEngine |
| Symbolic Name | HADR_AG_LEASE_RENEWAL_TIMEOUT |
| Message Text | SQL Server hosting availability group '%.*ls' did not receive a process event signal from the Windows Server Failover Cluster within the lease timeout period. |

## Explanation

Error 19421 is raised in the SQL Server error log when the lease helper on the Windows cluster side doesn't signal the SQL Server lease worker thread within the predefined lease period. Specifically, SQL Server calls [WaitForMultipleObjects()](/windows/win32/api/synchapi/nf-synchapi-waitformultipleobjects) to wait for the lease timeout event to be set to a signaled state. If the function returns `WAIT_TIMEOUT` because the wait exceeded the specified lease interval, error 19421 is raised.

A lease is a time-based communication mechanism that takes place between the SQL Server and the Windows Server Failover Cluster (WSFC) process, specifically the RHS.EXE process. The two processes communicate with each other periodically to ensure the other process is running and responding. This communication takes place using Windows [Event objects](/windows/win32/sync/event-objects) and ensures that a failover of the AG resource doesn't occur without the knowledge of the WSFC. If one of the processes doesn't respond to the lease communication based on a predefined lease period, a lease timeout occurs. For detailed information, see [Lease Mechanism](../../database-engine/availability-groups/windows/availability-group-lease-healthcheck-timeout.md). Also see [How It Works: SQL Server AlwaysOn Lease Timeout](https://techcommunity.microsoft.com/t5/sql-server-support-blog/how-it-works-sql-server-alwayson-lease-timeout/ba-p/317268).

This error is related to other lease timeout errors and provides more specific detail for error [MSSQLSERVER_19407](mssqlserver-19407-database-engine-error.md).

### Causes

Because Windows events are lightweight synchronization objects, relatively few external factors affect them negatively. Typical issues that can lead to a lease timeout involve system-wide problems. The following conditions can cause lease expiration and lead to a restart or failover:

- High CPU usage on the system (close to 100%)
- Out-of-memory conditions - low virtual memory and/or one of the processes is being paged out
- SQL Server process not responding while generating a large memory dump
- WSFC going offline (for example, due to quorum loss)

## User action

Check the corresponding availability group resource in the WSFC to see whether it reported any errors.

For detailed troubleshooting, see the User action section in [MSSQLSERVER_19407](mssqlserver-19407-database-engine-error.md#user-action), which covers the following steps:

- Troubleshoot high CPU issues
- Troubleshoot low memory issues
- Reduce or avoid large memory dumps of the SQL Server or cluster process
- Check virtual machine (VM) configuration for overprovisioning
- Check for virtual machine (VM) migration or backup causing issues
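
Reviewer sketch (not part of the committed file): going by the two Explanation sections in this commit, errors 19419 and 19421 appear to differ only in how the wait on the lease event ends. The following Python snippet models that distinction under stated assumptions: `threading.Event` stands in for the Windows event object, `Event.wait` with a timeout stands in for `WaitForMultipleObjects()`, and the function name, parameters, and return values are hypothetical. It's a conceptual model of the described behavior, not SQL Server's implementation.

```python
import threading
import time


def classify_lease_wait(lease_event, lease_interval_s, lease_deadline,
                        clock=time.monotonic):
    """Model one lease wait and return the error number it would map to.

    Returns None when the lease is renewed in time (hypothetical convention).
    """
    # Stand-in for WaitForMultipleObjects(): block until signaled or timed out.
    signaled = lease_event.wait(timeout=lease_interval_s)
    if not signaled:
        # Analogous to WAIT_TIMEOUT: no signal arrived within the lease interval.
        return 19421
    if clock() > lease_deadline:
        # Analogous to WAIT_OBJECT_0 observed too late: the signal arrived,
        # but the lease had already expired when the worker processed it.
        return 19419
    return None


signaled_event = threading.Event()
signaled_event.set()

# Signal present and deadline still in the future: lease renewed.
print(classify_lease_wait(signaled_event, 0.05, time.monotonic() + 60))   # None
# Signal present but deadline already passed: models error 19419.
print(classify_lease_wait(signaled_event, 0.05, time.monotonic() - 1))    # 19419
# No signal within the interval: models error 19421.
print(classify_lease_wait(threading.Event(), 0.05, time.monotonic() + 60))  # 19421
```

In this model, 19419 points at the waiting side being late (for example, a lease worker starved for CPU), while 19421 points at the signaling side never delivering the event.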

docs/toc.yml

Lines changed: 4 additions & 0 deletions
@@ -11508,6 +11508,10 @@ items:
 items:
 - name: 19407
 href: relational-databases/errors-events/mssqlserver-19407-database-engine-error.md
+- name: 19419
+href: relational-databases/errors-events/mssqlserver-19419-database-engine-error.md
+- name: 19421
+href: relational-databases/errors-events/mssqlserver-19421-database-engine-error.md
 - name: Errors 20,000 to 33,999
 items:
 - name: 20557
