During a SQL Server SnapSync (or Validated Sync) of a Full database backup, the RESTORE HEADERONLY command might time out with an error similar to the following:
Attempt to run sqlcmd command on remote host "Remote_Host_for_Shared_Backup_Location.Your_Company.com" timed out.
This error appears as the failure message of a SnapSync job on the SQL Server dSource. At that point, Delphix is running the RESTORE HEADERONLY command against the backup file that is to be ingested.
Applicable Delphix Versions
[Version table: Major Release / All Sub Releases, beginning with Major Release 6.0; the sub-release numbers did not survive extraction.]
To Resolve the Timeout for the RESTORE HEADERONLY Command
We have seen three distinct scenarios where the RESTORE HEADERONLY command may take longer than 10 minutes and time out.
1. In the first scenario, the staging databases were running on a SQL Server 2016 SP1 instance. There is a known bug in some versions of SQL Server that causes the RESTORE HEADERONLY command to take a long time on databases where Transparent Data Encryption (TDE) is enabled.
| Bug number | KB article | Description | Fix area |
|---|---|---|---|
| 10698847 | | Fixes an issue in which restore of a compressed backup for a Transparent Data Encryption (TDE) enabled database through the Virtual Device Interface (VDI) interface may fail with the operating system error 38. | SQL Engine |
| 10268790 | 4019893 | FIX: Restore fails when you do backup by using compression and checksum on a TDE enabled database in SQL Server 2016 | SQL service |
Here's a blog mention on a fix found in 2016 SP1 CU4:
2. Using a fully qualified host name rather than an alias when referencing the network share has proved to be more performant. For example, on the dSource's Configuration -> Data Management tab, specify the fully qualified domain name in the "Backup Path", such as \\Remote_Host_for_Shared_Backup_Location.Your_Company.com\SQLBackups.
3. In a few rare cases, we have seen the RESTORE HEADERONLY command take longer than 20 minutes when the source database's log file was very large. After shrinking the log file and then performing a backup, the RESTORE HEADERONLY command on the new backup completed substantially faster.
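For the second scenario, a quick way to spot backup paths that use a short alias is to inspect the host part of the UNC path. The following is a minimal sketch, not part of Delphix; the function names `unc_host` and `looks_fully_qualified` are hypothetical, and the check is only a heuristic: a dotted name can still be a DNS alias, so treat a positive result as a hint rather than proof.

```python
import ipaddress


def unc_host(path: str) -> str:
    r"""Return the host part of a UNC path such as \\host\share\file.bak."""
    if not path.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {path!r}")
    host = path[2:].split("\\", 1)[0]
    if not host:
        raise ValueError(f"UNC path has no host: {path!r}")
    return host


def looks_fully_qualified(path: str) -> bool:
    """Heuristic: True if the UNC host is a dotted DNS name, not a short alias or an IP."""
    host = unc_host(path)
    try:
        ipaddress.ip_address(host)
        return False  # an IP literal is not a host name
    except ValueError:
        pass
    labels = host.split(".")
    # At least two non-empty labels, e.g. "server.example.com".
    return len(labels) >= 2 and all(labels)
```

A path such as `\\backupsrv\SQLBackups` (a made-up alias) would be flagged as not fully qualified, while the example path used in this article would pass.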
The best way to isolate the cause and resolution of the SQLCMD timeout is to test the backup file used for the SnapSync (sync) job. In this case you know the backup location and you know the backup file name. To aid in troubleshooting, log in as the Delphix OS User for this staging environment, and then run the following command from the same staging SQL Server instance used to stage your SQL Server dSource.
RESTORE HEADERONLY FROM DISK=N'\\Remote_Host_for_Shared_Backup_Location.Your_Company.com\SQLBackups\Full_Backup_20200829.bak'
You can run this from a tool such as SQL Server Management Studio (SSMS). RESTORE HEADERONLY normally completes within a few seconds. By default, SSMS sets no timeout for SQL commands, so the command runs to completion and shows how long it really takes. If it takes more than 10 minutes (the default SQL command timeout in Delphix), you may be running into one of the issues described in this article. If it runs for 20 minutes with no response and the source database is TDE enabled, it is reasonable to infer that you are hitting the SQL Server bug mentioned above, and you might consider upgrading if you are running SQL Server 2016 SP1. On earlier versions, such as SQL Server 2014, it is not known whether a fix for the issue exists.
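If you prefer to time the command from a script instead of SSMS, the test can be sketched as follows. This is an illustration under assumptions, not how Delphix itself invokes sqlcmd: it assumes the `sqlcmd` utility is on the PATH, that you run it as the Delphix OS User so that Windows authentication (`-E`) applies, and the helper names and the one-hour `limit_s` ceiling are hypothetical.

```python
import subprocess
import time
from typing import List

# Default SQL command timeout in Delphix, as stated in this article.
DELPHIX_DEFAULT_TIMEOUT_S = 10 * 60


def restore_headeronly_sql(backup_path: str) -> str:
    """Build the RESTORE HEADERONLY statement, doubling any single quotes."""
    escaped = backup_path.replace("'", "''")
    return f"RESTORE HEADERONLY FROM DISK=N'{escaped}'"


def sqlcmd_args(server: str, query: str) -> List[str]:
    """sqlcmd arguments: -S server, -E Windows authentication, -Q run query and exit."""
    return ["sqlcmd", "-S", server, "-E", "-Q", query]


def exceeds_delphix_timeout(elapsed_s: float) -> bool:
    """True if the command ran longer than the default Delphix timeout."""
    return elapsed_s > DELPHIX_DEFAULT_TIMEOUT_S


def time_headeronly(server: str, backup_path: str, limit_s: int = 3600) -> float:
    """Run the command once and return elapsed seconds (limit_s is an arbitrary ceiling)."""
    start = time.monotonic()
    subprocess.run(
        sqlcmd_args(server, restore_headeronly_sql(backup_path)),
        check=True,
        capture_output=True,
        timeout=limit_s,
    )
    return time.monotonic() - start
```

Calling `time_headeronly` with your staging instance name and the backup path, then passing the result to `exceeds_delphix_timeout`, tells you whether the same command would have timed out inside a SnapSync job.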
If you decide to upgrade the staging instance, test the command again afterward. If the issue is resolved, the command will likely return in seconds. RESTORE HEADERONLY reads the backup file and returns the backup header information that SQL Server and applications use to understand aspects of the backup. Delphix compares this information to the backupset information on the source instance for the same backup file. If the information matches (LSNs, backup UUID, recovery fork GUID, etc.), the backup file has been validated and the SnapSync job proceeds to restore it.
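To make the comparison concrete, the sketch below checks one RESTORE HEADERONLY result row against one row from `msdb.dbo.backupset` on the source instance. It is not Delphix's implementation. The column pairing is an assumption based on the fields this article names (LSNs, backup UUID, recovery fork GUID); the exact set of fields Delphix compares is not documented here, and `mismatched_fields` is a hypothetical helper that expects both rows as plain dictionaries.

```python
# RESTORE HEADERONLY column -> msdb.dbo.backupset column (assumed pairing).
HEADER_TO_BACKUPSET = {
    "FirstLSN": "first_lsn",
    "LastLSN": "last_lsn",
    "CheckpointLSN": "checkpoint_lsn",
    "DatabaseBackupLSN": "database_backup_lsn",
    "BackupSetGUID": "backup_set_uuid",
    "RecoveryForkID": "last_recovery_fork_guid",
    "FirstRecoveryForkID": "first_recovery_fork_guid",
}


def _norm(value) -> str:
    """Compare as text: ignore case, surrounding whitespace, and GUID braces."""
    return str(value).strip().strip("{}").lower()


def mismatched_fields(header: dict, backupset: dict) -> list:
    """Return the header columns whose values are missing or differ from backupset."""
    mismatches = []
    for header_col, backupset_col in HEADER_TO_BACKUPSET.items():
        if header_col not in header or backupset_col not in backupset:
            mismatches.append(header_col)
        elif _norm(header[header_col]) != _norm(backupset[backupset_col]):
            mismatches.append(header_col)
    return mismatches
```

An empty list means every paired field agrees; any returned column name points at the value to inspect when a backup file is not recognized as the expected backup.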
The following articles may provide more information related to this article:
- KB4019893 - FIX: Restore fails when you do backup by using compression and checksum on a TDE enabled database in SQL Server 2016