Environment Discovery or Refresh of Windows Availability Group Cluster Environment Unable to Discover SQL Server Instances (KBA6724)
KBA
KBA# 6724
Issue
During the SQL Server environment discovery of an Availability Group (AG) Windows cluster there is a possibility that cluster configured instances might not be discovered. During the discovery/refresh of environment, the entire job may not fail, but there can be warnings like the following:
WARNING: Discovery of instance "MSSQLServer" on node "cluster_node_1.com" was skipped because SQL Server Availability Groups are not enabled.
Initially one might suspect a configuration issue with the AG, and this can be checked external to Delphix with the following query. If it returns "1", the AG is enabled. The script Delphix calls is named CheckAGEnabled.ps1, which in turn calls a SQLCMD query as shown here:
SELECT SERVERPROPERTY ('IsHadrEnabled')
There is usually another warning preceding the mentioned warning, and this provides a clue as to the root cause of why this script fails other than if the "IsHadrEnabled" property returns a "0". The resolution section covers various causes.
Applicable Delphix Versions
- Click here to view the versions of the Delphix engine to which this article applies
-
Major Release All Sub Releases 6.0 6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0 5.3
5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0 5.2
5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1
5.1
5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0
5.0
5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1 ,5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4
4.3
4.3.1.0, 4.3.2.0, 4.3.2.1, 4.3.3.0, 4.3.4.0, 4.3.4.1, 4.3.5.0
4.2
4.2.0.0, 4.2.0.3, 4.2.1.0, 4.2.1.1, 4.2.2.0, 4.2.2.1, 4.2.3.0, 4.2.4.0 , 4.2.5.0, 4.2.5.1
4.1
4.1.0.0, 4.1.2.0, 4.1.3.0, 4.1.3.1, 4.1.3.2, 4.1.4.0, 4.1.5.0, 4.1.6.0
Resolution of AG not enabled scenario
There are two distinct scenarios that can cause the CheckAGEnabled script to fail without ever connecting to the SQL Server instance.
1. Host connectivity issues
When the Staging host selected to be used for the cluster cannot reach the SQL Server host due to network issues, you might encounter this warning:
WARNING: Discovery of instance "SQLServer_Instance_Name" on node "Node1" failed : Sqlcmd: Error: Microsoft SQL Server Native Client 11.0 : TCP Provider: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. . Sqlcmd: Error: Microsoft SQL Server Native Client 11.0 : Login timeout expired. Sqlcmd: Error: Microsoft SQL Server Native Client 11.0 : A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.. die : Error: SQLCMD_FAILED At C:\Program Files\Delphix\DelphixConnector\discovery\564d4ebc-b938-6782-41ed-231013c06baf-WINDOWS_CLUSTER-36\temp\CheckAGEnabled.2e33a39b-6694-44d2-966d-955078f6c258.wrap.ps1:26 char:12 + die <<<< "$($args[0])" + CategoryInfo : NotSpecified: (:) [Write-Error], WriteErrorException + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,die . WARNING: Discovery of instance "SQLServer_Instance_Name" on node "Node1" was skipped because SQL Server Availability Groups are not enabled.
You can test this connectivity from the Windows connector (staging) host. Login as the Delphix OS user you provided for the cluster environment and then obtain the IP Address and hostname of the node(s) in question as well as the configured listener port of the SQL Server instance. Then run the following commands to isolate the issue and separate it from Delphix. Using port 99999 as an example. You can find this port from the SQL Server instance using the SQL Server Configuration Manager (see next scenario):
sqlcmd -E -l 30 -b -S 'tcp:Node1_hostname\SQLServer_Instance_Name,99999'
sqlcmd -E -l 30 -b -S 'tcp:10.100.160.1\SQLServer_Instance_Name,99999'
If these are returning the same error message, please engage networking and other infrastructure specialists responsible for the connectivity within your implementation. If they connect with no issue, please contact Customer Support at Delphix for assistance in troubleshooting the issue.
If the network issue is uncovered and resolved, please proceed with the discovery or refresh of the cluster environment.
2. SQL Server instance connection string is invalid
If there is a problem in obtaining the proper connectivity parameters to the source SQL Server instance you might see this warning:
WARNING: While discovering Windows Cluster "AG_Cluster" the following nodes were unreachable from proxy host "10.100.160.251" and will be skipped: [(name: Node1, IPs: [10.100.160.1]) (name: Node2, IPs: [10.100.160.2])]. WARNING: Discovery of instance "MSSQLSERVER" on node "Node1" failed : Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : SQL Server Network Interfaces: Connection string is not valid [87]. . Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : Login timeout expired. Sqlcmd: Error: Microsoft ODBC Driver 13 for SQL Server : A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.. die : Error: SQLCMD_FAILED At C:\Program Files\Delphix\DelphixConnector\discovery\4231a44b-9d11-69f9-b999-36d635b39b3c-WINDOWS_CLUSTER-1\temp\CheckAGEnabled.b1f8dffc-7c06-4d4b-bf01-211ab9c95899.wrap.ps1:26 char:12 + die <<<< "$($args[0])" + CategoryInfo : NotSpecified: (:) [Write-Error], WriteErrorException + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,die . WARNING: Discovery of instance "MSSQLSERVER" on node "Node1" was skipped because SQL Server Availability Groups are not enabled.
The key portion of the message is "Connection string is not valid". In most cases this occurs because the SQLCMD connection string contains a value of "0" for the port. You can check this by using the SQL Server Configuration Manager on the source node(s) experiencing the issue. In that tool you will start with selecting the SQL instance in question. In this example the instance name is SQL2016:
Select "TCP/IP" from the right side under the "Protocol Name" column.
Select the "IP Addresses" tab.
Scroll down past IP1 through IP4 all the way to IPAll.
If the issue is caused by TcpPort = 0, then the IPAll fields "TCP Dynamic Ports" and "TCP Port" will be blank. When Delphix calls the CheckAGEnabled.ps1 Powershell script, it passes SQL commands to the SQL Server instance with a SQL script that is passed by an invocation of SQLCMD. The connection string uses the port values from IPAll. Delphix checks "TCP Port" first, and, if empty, checks "TCP Dynamic Ports". Then a command is build, leading with the connection string:
sqlcmd -E -l 30 -b -S 'tcp:10.100.160.1\SQL_Instance_Name,0' ....
This is an invalid connection string and causes the warning. As seen above in the IPAll image, enter the correct port into the appropriate field. Ask your SQL Server DBA to check the port value and make the change if you are not authorized for this activity. If the port is static, then set port value to "TCP Port" and if dynamic, set to "TCP Dynamic Ports". Presumably the SQL Server is already running and listening on the correct port. At this point you can refresh the environment, or discover it again if it was deleted. There may be scenarios where the SQL Server needs to be restarted, such as, if you weren't configured for TCPIP and are just now setting the port in the SQL Server Configuration Manager. After restarting, you can discover/refresh the source cluster environment. Here is the value set in IPAll:
Troubleshooting
As demonstrated in the resolution section above, you can test connectivity to the SQL instance with SQLCMD. To proceed with this follow these steps:
- Since you are troubleshooting connectivity from the staging host to the source node(s), you will want to create a session logged in as the OS user used for the environment discovery/refresh. The session can be command prompt or powershell.
- Based on the type of problem (host access or port 0) you can build the SQLCMD connection string as such:
- Host access:
sqlcmd -E -l 30 -b -S 'tcp:Node1_hostname\SQLServer_Instance_Name,99999'
- Alternatively you can also test with the host IP Address:
sqlcmd -E -l 30 -b -S 'tcp:10.100.160.1\SQLServer_Instance_Name,99999'
- Port 0:
sqlcmd -E -l 30 -b -S 'tcp:Node1_hostname\SQLServer_Instance_Name,0'
If warning or error messages match what Delphix displays, you are likely on the right track and can proceed with resolution steps.
Related Articles
The following articles may provide more information or related information to this article: