Skip to main content
Delphix

Preventing SQL Server Msg 9003 Logical Log Corruption on VDBs when Source is Configured for Change Data Capture (CDC) (KBA10296)

 

 

KBA

KBA# 10296

 

Issue

It is possible to provision a VDB from a dSource where the source database has Change Data Capture (CDC) enabled, but the CDC feature is not enabled as you do not want to utilize it. However, the VDB will inherit the CDC configuration based on the process of provisioning. In this situation, if the source database is relatively large, on the order of hundred of GBs to TB range, there is a good possibility the VDB will experience an error as shown below when DB_SYNC occurs:

DB_SYNC on VDB:

DB_SYNC job started for "VDB_Group/VDB_Name".
Error trying to generate a manual checkpoint on virtual database "VDB_Name":
Changed database context to 'VDB_Name'.
CHECKPOINT

Msg 5901, Level 16, State 1, Server SQL_Server_Instance, Line 1
One or more recovery units belonging to database 'VDB_Name' failed to generate a checkpoint. This is typically caused by lack of system resources such as disk or memory, or in some cases due to database corruption. Examine previous entries in the error log for more detailed information on this failure.

Msg 9003, Level 20, State 15, Server SQL_Server_Instance, Line 1
The log scan number (894845:55302:1) passed to log scan in database 'VDB_Name' is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf). If this error occurred during replication, re-create the publication. Otherwise, restore from backup if the problem results in a failure during startup.

die : Error: SQLCMD_FAILED
At C:\Delphix\DelphixConnector\4200eeee-0333-e111-9222-cddddddddddd-vdb-375\SCRIPT\GenerateCheckpoint.wrap.ps1:29 char:9

Self-Service Container:

Failed Job
Restore Self-Service data container "VDB_Container". [JOB-36406]
Run SnapSync for database "VDB_NAME". [JOB-36407] Error trying to generate a manual checkpoint on virtual database "VDB_NAME": Changed database context to 'VDB_NAME'.
CHECKPOINT

Msg 5901, Level 16, State 1, Server SQL_Server_Instance, Line 1
One or more recovery units belonging to database 'DM_Master' failed to generate a checkpoint. This is typically caused by lack of system resources such as disk or memory, or in some cases due to database corruption. Examine previous entries in the error log for more detailed information on this failure.

Msg 9003, Level 20, State 9, Server SQL_Server_Instance, Line 1
The log scan number (71553746:25928:0) passed to log scan in database 'DM_Master' is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf). If this error occurred during replication, re-create the publication. Otherwise, restore from backup if the problem results in a failure during startup.

die : Error: SQLCMD_FAILED
At C:\Delphix\DelphixConnector\4200eeee-0333-e111-9222-cddddddddddd-vdb-375\SCRIPT\GenerateCheckpoint.wrap.ps1:29 char:9
+ die "$($args[0])"

The issue occurs during DB_SYNC and the VDB may actually show a "RUNNING" status. When this happens, updates to the VDB data can fail with the Msg 9003 error. Any subsequent CHECKPOINT commands will also fail the same way.

Continue to read to understand how to mitigate this issue before future releases provide actual code relief.

 

Prerequisites

  • Current Delphix engine releases from 6.0.X on upward
  • Source database of the VDB parent dSource (VDB can also be a child of another VDB fitting into the same criteria) has CDC enabled
  • Source database ranges from hundreds of GBs to the TB range
  • Child VDB is provisioned with the CDC option Unchecked

 

Applicable Delphix Versions

Click here to view the versions of the Delphix engine to which this article applies
Date Release
Aug 24, 2023 14.0.0.0
Jul 24, 2023 13.0.0.0
Jun 21, 2023 12.0.0.0
May 25, 2023 11.0.0.0
Apr 13, 2023 10.0.0.0 | 10.0.0.1
Mar 13, 2023 | Mar 20, 2023 9.0.0.0 | 9.0.0.1
Feb 13, 2023 8.0.0.0
Jan 12, 2023 7.0.0.0
Releases Prior to 2023
Major Release All Sub Releases
6.0

6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0, 6.0.6.0, 6.0.6.1, 6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1, 6.0.13.0, 6.0.13.1, 6.0.14.0, 6.0.15.0, 6.0.16.0, 6.0.17.0, 6.0.17.1, 6.0.17.2

5.3

5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0, 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0

5.2

5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1

5.1

5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0

5.0

5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1, 5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4

Resolution

Resolving the VDB in the current state

  • If you receive the "Msg 9003" during a snapshot job, and the VDB shows as RUNNING, you need to resolve the issue to prevent other jobs from failing and also to prevent failures in applications performing Data Manipulation Language (DML), including Updates, Inserts, and Deletes.
  • Stop and Start the VDB. If this succeeds, the Log file has been repaired by SQL Server via the startup process.
  • If still not working, contact Delphix Customer Support.

Workaround to Prevent Msg 9003

The following steps can help prevent the error:

  • Even though the VDB is to be configured without CDC, or if the user prefers to configure CDC to their specific needs without enablement from Delphix, the best thing to do is to keep the "Change Data Capture (CDC)" option enabled on the VDB.
  • If the VDB is already in the state where the CDC option is not enabled on the VDB configuration, one can easily configure the option to be enabled for subsequent Provision jobs (Refreshes, Rollbacks).
    1. From the VDB Configuration > Source page, click the edit pencil so as to configure CDC:
      clipboard_e1762703a7770e1e8e5e3fa072ad5b8a9.png
    2. Check the option:
      clipboard_ee5f247f157bb3c2228a24045e546c8fc.png
    3. Click okay. Now the VDB metadata has the option for CDC set to true. This means during provision jobs, and enable of the VDB, the workflow will enable CDC on the VDB, creating the "cdc" schema tables used on the database to track CDC.
    4. After provisioning, disable CDC on the VDB with this command while using the VDB:
sys.sp_cdc_disable_db

This is best accomplished with a Configure Clone hook. Here is a sample hook:

# Use if you wish to trouble shoot and debug the code.
# Set-PSDebug -Trace 2; 

$VDB_HOST = $env:VDB_INSTANCE_HOST 
$VDB_PORT = $env:VDB_INSTANCE_PORT 
$VDB_INSTANCE = $env:VDB_INSTANCE_NAME 
$VDB_NAME = $env:VDB_DATABASE_NAME 
$CONNECT_STRING = "$VDB_HOST\$VDB_INSTANCE,$VDB_PORT" 

$script:ErrorActionPreference = "Continue" 

function die { 
    Write-Error "Error: $($args[0])" 

    # run exit handler, if defined 
    if (Get-Command -type Function -name atExit 2> $null) { 
        atExit 
    } 
    exit 1 
} 

function verifySuccess { 
    if (!$?) { 
        die "$($args[0])" 
    } 
} 

## sqlcmd 
sqlcmd -l 30 -b -S "$VDB_HOST\$VDB_INSTANCE,$VDB_PORT" -d $VDB_NAME -W -E -s ";" -q "sys.sp_cdc_disable_db"  -o "C:\Users\delphix\DelphixPSLog_$((Get-Date).ToString('MM-dd-yyyy_hh-mm-ss')).log"

Alternatively, one can run the command manually after the provision job (Refresh, for example) and then perform a snapshot on the VDB to preserve the setting. (The Configure Clone hook runs the script after VDB is running, just prior to the snapshot sub-job that runs after the provision).

  1. At this point the VDB will no longer have the "cdc" schema resources and CDC itself is disabled.
  2. You can check CDC status from SSMS (SQL Server Management Service) - select the VDB, right click to select Properties.
  3. Click on Query Store and check the status value for Operation Mode.
  4. The Actual and Requested should both be Off for the CDC to be disabled.
  5. Without the cdc schema resources active or in existence, the incidence of Msg 9003 errors will be practically eliminated.

Troubleshooting

When a Msg 9003 error is received from DB_SYNC on the VDB, check the VDB for CDC schema (cdc tables) and the source of the VDB as well. The most common configuration is provisioning a VDB from a CDC enabled source and NOT checking the "Change Data Capture (CDC)" option for the VDB.

It is often difficult to tell this issue occurs if you are not noticing the lack of timecards from a daily snapshot. The VDB will likely remain in the running state. However, if applications are opening and closing transactions and running CHECKPOINT, the error will occur on that call. 

Looking in the SQL Server ERRORLOG might show a huge number of the messages, indicating the underlying SQL Server processes are trying to flush out the database changes to a log file, only to fail with the Msg 9003.

If this is the case, follow the resolution steps.

 

Related Articles

The following articles may provide more information or related information to this article:

  • N/A