You have a multi-subnet Availability Group environment with the following configuration:
- Availability Group Listener with one IP address for each defined subnet and an “OR” dependency on each of the IP addresses
- SQL Server 2014 with .NET 4.5
- MultiSubnetFailover parameter set to “TRUE” in the connection string
- Primary replica -> Read/Write
- Secondary replicas -> HADR only
Problem:
Some users get timeout errors when trying to connect. For others, the connection timeouts are intermittent.
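To see the symptom from a client machine, you can time a single connection attempt through the listener. This is only a rough sketch: the listener name, port and database are hypothetical placeholders, and it assumes Windows authentication and that the System.Data.SqlClient provider is available in your PowerShell session.

```powershell
# Hypothetical listener name, port and database; replace with your own values.
$connectionString = "Server=tcp:aglistener.contoso.com,1433;Database=master;" +
                    "Integrated Security=SSPI;MultiSubnetFailover=True;Connect Timeout=15"

$conn = New-Object System.Data.SqlClient.SqlConnection $connectionString

# Time one connection attempt; a failure is reported as a warning instead of stopping the script.
$elapsed = Measure-Command {
    try     { $conn.Open() }
    catch   { Write-Warning $_.Exception.Message }
    finally { $conn.Close() }
}

"Connection attempt took {0:N1} seconds" -f $elapsed.TotalSeconds
```

Running it a few times from an affected client should show whether the attempts are slow or hit the timeout only some of the time.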
Solution:
To resolve this issue, change how the Availability Group Listener (AGL) is registered in DNS. Two parameters, RegisterAllProvidersIP and HostRecordTTL, affect AGL registration in DNS. By modifying them you can change how the client OS resolves and caches the listener name, without any change on the client side. Follow these steps:
1. Get cluster resources with the following PowerShell command:
Get-ClusterResource
Note: If the above command is not recognised, import the FailoverClusters module using the following command and then run the above command again.
Import-Module FailoverClusters
Copy the “AG Listener Resource Name” from the Name column. Its resource type will be “Network Name” and its owner group will be the name of your availability group.
2. Check the current values of the RegisterAllProvidersIP and HostRecordTTL parameters using the following PowerShell command:
Get-ClusterResource "<AG Listener Resource Name>" | Get-ClusterParameter HostRecordTTL, RegisterAllProvidersIP
If the value of RegisterAllProvidersIP is “1” (the Windows cluster registers in DNS all of the IP addresses the AGL depends on), change it to “0” (only the one active IP address is registered in DNS).
If the value of HostRecordTTL is 1200, change it to something lower, such as 300. HostRecordTTL governs how long (in seconds) a client OS keeps a cached DNS entry before it expires. The default is 1200 seconds, so a client OS caches the record for 20 minutes and queries the DNS server again only after the cached record expires.
3. Change the parameter values with the following PowerShell commands:
Get-ClusterResource "<AG Listener Resource Name>" | Set-ClusterParameter -Name HostRecordTTL -Value 300
Get-ClusterResource "<AG Listener Resource Name>" | Set-ClusterParameter -Name RegisterAllProvidersIP -Value 0
These settings take effect once you take the AGL offline and bring it online again, which forces it to re-register with DNS. Caution: this also takes the availability group offline, because of its dependency on the AGL, and causes an outage. To avoid the outage, you can temporarily remove this dependency, which lets you take the AGL offline and bring it back online without taking the availability group offline.
I recommend doing this during a maintenance window.
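As a rough sketch, the dependency can be removed and restored with the dependency cmdlets from the FailoverClusters module. Both resource names below are hypothetical placeholders; take the real ones from the Get-ClusterResource output. The sketch also assumes the listener is the only dependency of the availability group resource, so check the recorded dependency expression before restoring it.

```powershell
Import-Module FailoverClusters

# Hypothetical names; use the values shown by Get-ClusterResource.
$agResource = "MyAG"              # resource type "SQL Server Availability Group"
$listener   = "MyAG_MyListener"   # resource type "Network Name"

# Record the current dependency expression so it can be restored as it was.
Get-ClusterResourceDependency -Resource $agResource

# Temporarily remove the availability group's dependency on the listener.
Remove-ClusterResourceDependency -Resource $agResource -Provider $listener

# Cycle the listener so it re-registers with DNS.
Stop-ClusterResource  -Name $listener
Start-ClusterResource -Name $listener

# Put the dependency back once the listener is online again.
Add-ClusterResourceDependency -Resource $agResource -Provider $listener
```

While the dependency is removed, existing connections through the listener are still affected when it goes offline, so a maintenance window remains the safer choice.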
4. Once you have changed the values of the RegisterAllProvidersIP and HostRecordTTL parameters, take the listener resource offline and bring it online again to force re-registration with DNS, using the following PowerShell commands:
Stop-ClusterResource "<AG Listener Resource Name>"
Start-ClusterResource "<AG Listener Resource Name>"
5. Verify that the parameter values have changed:
Get-ClusterResource "<AG Listener Resource Name>" | Get-ClusterParameter HostRecordTTL, RegisterAllProvidersIP
6. Force a DNS update with the following PowerShell command:
Get-ClusterResource "<AG Listener Resource Name>" | Update-ClusterNetworkNameResource
This should fix the issue. If it does not, continue reading.
7. You may still have issues even after doing all of the above, because of the following:
- The DNS server that the OS cluster contacts when registering or de-registering hostnames may not be the same DNS server that clients use to resolve names to IP addresses
- DNS replication topology
- How the DNS Aging and Scavenging property is configured
To fix the issues mentioned in step 7, manually delete the listener's host record from all DNS servers and make sure that dynamic DNS updates are allowed within the environment.
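Before deleting anything, it can help to compare what each DNS server currently returns for the listener. The sketch below is only an illustration: the listener name and DNS server addresses are hypothetical, and it assumes the DnsClient module (Resolve-DnsName, Clear-DnsClientCache) is available on the machine you run it from.

```powershell
# Hypothetical listener name and DNS servers; replace with your own.
$listenerName = "aglistener.contoso.com"
$dnsServers   = "10.0.1.10", "10.0.2.10"

# Ask each DNS server directly for the listener's A records and show the TTL it hands out.
foreach ($server in $dnsServers) {
    Resolve-DnsName -Name $listenerName -Type A -Server $server -DnsOnly |
        Select-Object @{ Name = 'DnsServer'; Expression = { $server } }, Name, IPAddress, TTL
}

# On a client, clear the local resolver cache before testing the connection again.
Clear-DnsClientCache
```

If the servers return different addresses, or more than one address each, the stale records are the likely cause of the remaining timeouts.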
To avoid the above hassle, create the AGL (Client Access Point) using Windows Failover Cluster Manager when configuring a multi-subnet Availability Group. When a Client Access Point (AGL) is created using Windows Failover Cluster Manager, the RegisterAllProvidersIP parameter is set to “0”, so only one active IP address is registered with DNS (the IP address in the subnet hosting the primary replica). You still have to lower HostRecordTTL, though.