Skip to main content
Delphix

Masking SQL Server - Hangs, Blocks and Deadlocks (KBA6319)

 

KBA

KBA# 6319

 

At a Glance   

Description This article details possible root causes and solutions to Blocks and Deadlocks on MS SQL Server. 
Root cause SQL Server uses locks on objects to prevent conflicts between concurrent transactions and ensure data consistency. 

Usually, this happens (but not always) between SELECT and UPDATE.

Since masking uses multiple transactions, these locks can in rare cases cause the database operations to hang (block). A special case of a block is when SQL Server detects the block and terminates a transaction - this terminates the execution with a Deadlock error.
How to identify: Below is a high-level overview to Identify hung jobs and how to relate these to SQL Server:
  Step 1a.
Job Monitor
On the masking engine, there are some clear signs that the job has hung due to the Database blocks:
  1. Masking performance drops without more records being masked.
  2. The Masked Rows is stuck at a multiple of Commit Size -1 (for example 9,999). 
  3. The logs show no actions (there are multiple Throughput logger entries like this one):
...
ThroughputLogger:interval 0.000 MB (0 rows) counted in 30.000 sec; throughput = 0.000 MB/sec
...
Step 1b.
Job Monitor
On the masking engine, the job never starts processing rows.

This is a special case and there can be blocks when the masking job is preparing for masking (like creating MASKING_GENERATED_IDENTITY_TMP). 
  1. The masking job is not starting (there might not be a row count).
  2. The logs are not showing any progress or useful info (since the engine is waiting for the database).
Step 2.
SQL Blocks
Use SQL Server to verify if and what Transaction is blocked.

Use the following SQL commands (and look for blocked transactions):
Deadlocks SQL Server has a deadlock reporting tool.

To view the Deadlock, open SQL Management Studio:

  1. In the Object Explorer - open your database
  2. Go to Management > Extended Events > Sessions > system_health > package0.event_file 
  3. Click View Target Data….
  4. Search for the deadlock event
Solutions: If the block is on SELECT and UPDATE, then there are three tested solutions - these are:
1 Clustered Index Creating a Clustered Index is the sure way to resolve this issue. Follow the steps in this KBA:
2 Read
Committed
Snapshot
If a Clustered Index is not possible or if Complex Custom SQL is used - RCSI might be the way to go:
3 OTF Masking Another way to separate Read and Update on a table is to use On-The-Fly (OTF) masking:
General best practices

There are some best practices for an MS SQL In-Place masking job. These points will reduce the chance of encountering a hang (or deadlock). 

The SELECT statement fetching the data will define the order in which data is read. The best option is when all rows are retrieved in a Logical Order (rows are fetched page by in consecutive order and each page only accessed once). A Clustered Index Scan will do this.

Note: If another Scan is performed (such as NonClustered Scan) - please investigate. For NonClustered, this means there is an index that needs to be dropped. 


Therefore, consider the following: 

  1. Use a Clustered Index as the Unique Row Identifier (Key).  
    For best performance and execution, drop all existing indexes on the table prior to creating the Clustered Index.
     
  2. Try to avoid using WHERE and JOIN.
    Depending on the Execution Plan, these might change the record order or cause multiple scans.
     
  3. Avoid using: UNION and ORDER BY.
    These will cause multiple scans / change the order.
     
  4. Ensure that there is no external transaction running which will change the data on the masked table. This includes Triggers (which should be disabled in this case). 
More Info More info on how to access logs: 


SQL Server related information:

Applicable Delphix Versions

Click here to view the versions of the Delphix engine to which this article applies
Major Release All Sub Releases
6.0 6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0

5.3

5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0

5.2

5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1

5.1

5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0

5.0

5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1 ,5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4

4.3

4.3.1.0, 4.3.2.0, 4.3.2.1, 4.3.3.0, 4.3.4.0, 4.3.4.1, 4.3.5.0

4.2

4.2.0.0, 4.2.0.3, 4.2.1.0, 4.2.1.1, 4.2.2.0, 4.2.2.1, 4.2.3.0, 4.2.4.0 , 4.2.5.0, 4.2.5.1

4.1

4.1.0.0, 4.1.2.0, 4.1.3.0, 4.1.3.1, 4.1.3.2, 4.1.4.0, 4.1.5.0, 4.1.6.0

Hung and deadlocked masking jobs

One of the more challenging problems occurs when a job on the Masking Engine hangs due to MS SQL Server becoming unresponsive due to a lock. If MS SQL Server decides to handle this situation by terminating one of the processes, this is called a deadlock.

It is important to note that this is a SQL Server Database issue and not a Delphix configuration issue. A good understanding of how MS SQL Server works and your database's internal structures is essential for resolving these issues.

  • A hung masking job arises due to a lock between the UPDATE and SELECT operations. One possible reason is that the rows are not being read in the logical order they are stored, leading to conflicts in the locking mechanism. A possible root cause here is FORWARDED RECORDS (which requires the SELECT to jump to another page in order to read the data).
     
  • The lock above is the most common issue. However, there are also other locking scenarios:
    • Blocks on Pre-SQL-Scripts 
    • Blocks when creating MASKING_GENERATED_IDENTITY_TMP
    • Blocks and Deadlocks when disabling/dropping Constraints (such as FKs)

Animation of Masking Job on SQL Server

If the data in the table is arranged in a logical order, it should not lead to blocking or deadlocks during the masking job.

The animation below shows a normal masking job and how the locks are successfully taken and released

What success looks like

The job below is masking a column called 'Mask' using a Unique Row Identifier (URI or key) called 'ID'. The ID column is a Primary Key and is Indexed using a Clustered Index

Given this is an In-Place masking job, it necessitates reading and updating the data simultaneously.

Note 1: Yes, the page locks below (X) and (S) are not compatible. In this case, this is acceptable due to the following:

  • No locks are taken on the same page at the same time.
    • The SELECT locks are very brief and on two pages at the same time (the locks overlap to maximize performance).
    • The UPDATE does lock pages to update 10,000 (Commit Size) records - this is on different pages due to buffered rows.

Note 2: As few as possible numbers of locks are taken - the locks in this example are (KEY, PAG, TAB). 

 

MS SQL Moving v5 (fast input w gauge).gif

Blocks and Deadlocks

Below, we will look at blocks and deadlock.

These happen when:

  • The record order of the fetch is scattered (i.e. due to the order in the index) resulting in scattered page locks on SELECT
  • In turn, this causes the UPDATE to be scattered
  • A larger number of locks are taken - examples of lock types are RID, KEY, PAGE, and TAB.
  • The number of pages locked increases when WHERE is used. 
    • This can be seen in the example further down on this page where the index is a NonClustered Index. 
  • Lock escalations - can be triggered by resource constraints in the SQL Server database.

Masking job is hung due to lock

On SQL Server

The most common lock type here is:

LCK_M_IX: this wait type occurs when a thread is waiting for a table or page IX lock so that a row insert or update can occur. 

The lock is usually on a PAGE (from MS SQL). 

lastwaittype  | LCK_M_IX                        
waitresource  | PAG: 9:21:47483425   

Masking Engine logs

The Masking Logs will not show any error here. The masked row count will though stop at '(COMMIT_SIZE x n) - 1'. For example 9,999. 

The Update in Masking Logs will stop and the Job Monitor will (in this case) show something like 1x,9999 rows masked:

[JOB_ID_XX_YY] ... Table input.0 - linenr 50000
[JOB_ID_XX_YY] ... Table input.0 - linenr 100000
[JOB_ID_XX_YY] ... Table input.0 - linenr 150000
[JOB_ID_XX_YY] ... DelphixTableUpdate.0 - linenr 50000
[JOB_ID_XX_YY] ... Table input.0 - linenr 200000
[JOB_ID_XX_YY] ... Table input.0 - linenr 250000
[JOB_ID_XX_YY] ... DelphixTableUpdate.0 - linenr 100000
[JOB_ID_XX_YY] ... Table input.0 - linenr 300000
[JOB_ID_XX_YY] ... Table input.0 - ThroughputLogger:interval x MB (x rows) counted in 30.000 sec; throughput = x MB/sec
...
[JOB_ID_XX_YY] ... ThroughputLogger:interval 0.000 MB (0 rows) counted in 30.000 sec; throughput = 0.000 MB/sec
[JOB_ID_XX_YY] ... ThroughputLogger:interval 0.000 MB (0 rows) counted in 30.000 sec; throughput = 0.000 MB/sec
[JOB_ID_XX_YY] ... ThroughputLogger:interval 0.000 MB (0 rows) counted in 30.000 sec; throughput = 0.000 MB/sec
...

Masking job deadlocked

A deadlock is when MS SQL Server's deadlock monitor thread has identified an unresolvable block and terminates one of the processes. This will resolve the lock but from a masking job's point of view, the job has terminated.

Masking Engine logs

The error seen in the Masking Logs is something like this: 

[JOB_ID_XX_YY] ... Transaction (Process ID nnn) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

How to resolve

To solve these issues, the main things are:

  • ensure that the data in the table is stored in a Logical Order.
  • ensure that the data is Fetched in this Logical Order (in the SELECT).
  • limit the number of locked pages. 

General Best Practices

These are some best practices for an MS SQL In-Place masking job and will reduce the chance of encountering a hang (or deadlock). 

  • Use a Clustered Index as the Unique Row Identifier (Key).  
     
  • Try to avoid using WHERE and JOIN
    •  Depending on the Execution Plan, these might change the record order or cause multiple scans.
       
  • Avoid using: UNION and ORDER BY 
    • These will cause multiple scans / change the order.
       
  • Ensure that there is no external transaction running which will change the data on the masked table. This includes Triggers (which should be disabled in this case). 

Solutions

If the block is on SELECT and UPDATE, then there are three tested solutions - these are:

1 Clustered Index Creating a Clustered Index is the sure way to resolve this issue. Follow the steps in this KBA:
2 Read
Committed
Snapshot
If Clustered Index is not possible or if Complex Custom SQL is used - RCSI might be the way to go:
3 OTF Masking Another way to separate Read and Update on a table is to use On-The-Fly (OTF) masking:

Why do lock issues happen?

While we may not have a complete understanding of all the reasons behind lock issues and their exact mechanics, it is evident that for a lock to occur, data needs to be fetched from pages in a random order. 

With this information, let's look at some reasons why the lock can happen. 

Table: FORWARDED RECORDS

FORWARDED RECORD refers to a data row that has been moved from its original data page to a new page due to data modification operations (update or alteration). The data row is moved when the existing data page no longer has enough space to accommodate the modified row. Consequently, SQL Server creates a forwarding pointer on the original page that points to the new location of the row.

Salient notes:

  • This only happens on a table without a Clustered Index (a Heap Table).
  • When the SELECT fetches data with forwarded records, SQL Server must follow the forwarding pointers to a different page.
    • This increases the risk of locks and can lead to slower performance. 
  • ​​​​​​The generation of MASKING_GENERATED_IDENTITY_TMP can cause FORWARDED RECORDS.

Managing forwarded records is a DBA operation and is crucial for maintaining optimal database performance and avoiding blocking issues.

Table: Type of INDEX and Heap Table

Since one of the main root causes for blocking issues is FORWARDED RECORDS, using the right Index will help. 

  • The Update Key (PK, Identity column, and Logical Key) should have a Clustered Index.
  • If the URI is indexed using a Nonclustered Index - this could cause the records and pages read to be scattered when accessed (and updated)
  • If the table is a Heap table and there is no PK the masking engine will create MASKING_GENERATED_IDENTITY_TMP. In the current version, the index is a Nonclustered Index. 

Rule Set: WHERE clause 

Columns that are indexed in the predicate of a WHERE clause will (very) likely alter the execution plan and impact the order of row retrieval.

  • If a WHERE clause is used, any columns in the predicate should not be Indexed.

Rule Set: Custom SQL with ORDER BY 

This should not be used. ORDER By changes the order in which the data is retrieved. 

Note: I can think of one exception - to ensure the data is retrieved in a Logical and Optimal order (for example when WHERE is used).

Rule Set: Custom SQL with UNION 

This should not be used. UNION will cause multiple scans of the table.

  • If needed, rewrite to not use UNION.
  • If not possible, use READ COMMITTED SNAPSHOT (see KBA).

Masking: Row count executed while table being masked

Before the masking job starts, the masking engine will execute a SELECT COUNT on all tables in the masked Rule Set. Most of the time, the result of this is returned before the job even starts.

It has been noticed that the SELECT COUNT can in some cases take a long time and that the query is running while the masking job is masking the same table. This can cause a block (or deadlock) since the number of Page locks increases. 

SQL Server: SELECT Parallel processing 

If SQL Server has access to multiple threads the database might use multiple threads to fetch the data. This means that different pages will be accessed and the fetch can be out of order which in the end can result in a lock with the UPDATE.

SQL Server: Lock Escalation

Lock escalation is MS SQL Server's process of converting many small locks (such as page locks) into table locks. MS SQL Server will dynamically determine when to perform a lock escalation.

It is also possible that there are limited resources on the SQL Server. 

Note: It is not known if MS SQL Server will ever cause an escalation when masking data. This should not happen as there are separate connections between UPDATE and SELECT. 

SQL Server: Other 

There are some technologies that can cause blocks. These are: 

  • Parallel processing (mentioned above).
  • Multiple concurrent executions and "merry-go-round scanning".
  • B-Tree structure and IAM pages. 
  • Filtering and "predicate pushdown".

Troubleshooting

In order to understand why a lock or deadlock happened, we need to collect data from the database. This needs to be done when the lock is happening.

Use the SQL commands below to collect data: 

Tasks - specific to Deadlock

When a deadlock happens there are no active sessions left on the database.

To find deadlocks, use the MS SQL Serverdeadlock reporting tool.

  • Open SQL Server Management Studio:
    1. In the Object Explorer - open your database
    2. Go to Management > Extended Events > Sessions > system_health > package0.event_file 
    3. Click View Target Data….
    4. Search for the deadlock event
  • Note: the recent items are at the end of the list. 

Example - Server Deadlock Reporting Tool

xml_report

<deadlock>
    <victim-list>
        <victimProcess id="process1c190257468"/>
    </victim-list>
    <process-list>
        <process id="process1c190257468" taskpriority="0" logused="0" waitresource="PAGE: 7:1:13463 " waittime="15389" ownerId="324962" transactionname="implicit_transaction" lasttranstarted="2020-10-20T21:04:35" XDES="0x1c19ebe8428" lockMode="S" schedulerid="1" kpid="7724" status="suspended" spid="57" sbid="0" ecid="0" priority="0" trancount="1" lastbatchstarted="2020-10-20T21:04:34.963" lastbatchcompleted="2020-10-20T21:04:34.957" lastattention="1900-01-01T00:00:00.957" clientapp="Microsoft JDBC Driver for SQL Server" hostname="YourME" hostpid="0" loginname="user" isolationlevel="read committed (2)" xactid="324962" currentdb="7" currentdbname="Test" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128058">
            <executionStack>
                <frame procname="adhoc" line="1" stmtend="86" sqlhandle="0x0200000048e01434b96494e5263ed6bc0a64fdc729f06e2d0000000000000000000000000000000000000000">  unknown    
       </frame>
            </executionStack>
            <inputbuf>  SELECT &quot;Mask&quot; , &quot;ID&quot; FROM &quot;dbo&quot;.&quot;Heap_2_Col&quot;   
     </inputbuf>
        </process>
        <process id="process1c1909b6108" taskpriority="0" logused="2594536" waitresource="PAGE: 7:1:10106 " waittime="3511" ownerId="326162" transactionname="implicit_transaction" lasttranstarted="2020-10-20T21:14:31.797" XDES="0x1c195df0428" lockMode="IX" schedulerid="1" kpid="5932" status="suspended" spid="60" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2020-10-20T21:14:31.767" lastbatchcompleted="2020-10-20T21:14:31.677" lastattention="1900-01-01T00:00:00.677" clientapp="Microsoft JDBC Driver for SQL Server" hostname="YourME" hostpid="0" loginname="user" isolationlevel="read committed (2)" xactid="326162" currentdb="7" currentdbname="Test" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128058">
            <executionStack>
                <frame procname="adhoc" line="1" stmtstart="60" stmtend="192" sqlhandle="0x020000009fb93f2080f3c22eb583aea908559018e477dd910000000000000000000000000000000000000000">  unknown    
     </frame>
            </executionStack>
            <inputbuf>  (@P0 varchar(8000),@P1 bigint)UPDATE &quot;dbo&quot;.&quot;Heap_2_Col&quot; SET &quot;Mask&quot; = @P0 WHERE  ( ( ID = @P1  ) )    </inputbuf>
        </process>
    </process-list>
    <resource-list>
        <pagelock fileid="1" pageid="13463" dbid="7" subresource="FULL" objectname="Test.dbo.Heap_2_Col" id="lock1c17700a900" mode="IX" associatedObjectId="72057594044350464">
            <owner-list>
                <owner id="process1c1909b6108" mode="IX"/>
            </owner-list>
            <waiter-list>
                <waiter id="process1c190257468" mode="S" requestType="wait"/>
            </waiter-list>
        </pagelock>
        <pagelock fileid="1" pageid="10106" dbid="7" subresource="FULL" objectname="Test.dbo.Heap_2_Col" id="lock1c177269580" mode="SIU" associatedObjectId="72057594044350464">
            <owner-list>
                <owner id="process1c190257468" mode="S"/>
            </owner-list>
            <waiter-list>
                <waiter id="process1c1909b6108" mode="IX" requestType="convert"/>
            </waiter-list>
        </pagelock>
    </resource-list>
</deadlock>

Example - Trace from the deadlock

The following example was captured from a SQL Trace. A trace needs to be captured while the deadlock happens. Please see below how to capture a trace.

Look for the following EventClasses:

  • 25 (Lock: Deadlock event) - Deadlock event.
  • 26 (Lock: Cancel event) - Lock acquisition has been canceled (for example, due to deadlock). 
  • 27 (Lock: Timeout) - Lock timed out due to blocking lock on a required resource. 
  • 59 (Lock: Deadlock Chain) - Chained events leading up to the deadlock.
  • 148 (Deadlock Graph) - XML description of the deadlock.

 

Note: The XML used to create this graph is the same as the XML above. There might be a tool to create these outside the trace tool. 

KBA6319 - Deadlock Trace.png

Tasks - specific to Hangs

SQL #1 - Current locks

Run this MS Store Procedure from the MS SSMS. The command will show the current locks: 

sp_lock

Example Output

The example below is from a masking job with DMS_ROW_ID and shows: 

  • spid: The session ID form the process.
  • dbid: The database ID. 
  • ObjId: The database object ID.
  • IndId: The index ID. 
  • Type: The lock type.
  • Resource: The resource that is locked. 
  • Mode: The lock mode. 
  • Status: The lock status, CNVRT = Being converted, GRANT = granted, WAIT = is waiting for another process. 

For details see: https://docs.microsoft.com/en-us/sql...l-server-ver15 

+------+-------+------------+-------+------+-------------+-------+--------+
| spid | dbid  | ObjId      | IndId | Type | Resource    | Mode  | Status |
+------+-------+------------+-------+------+-------------+-------+--------+
| 54   | 7     | 0          | 0     | DB   |             | S     | GRANT  |
| 54   | 32767 | -571204656 | 0     | TAB  |             | Sch-S | GRANT  |
| 54   | 1     | 1467152272 | 0     | TAB  |             | IS    | GRANT  |
| 54   | 7     | 0          | 0     | DB   |             | S     | GRANT  |
| 54   | 7     | 0          | 0     | DB   |             | S     | GRANT  |
| 54   | 7     | 0          | 0     | DB   |             | S     | GRANT  |
+------+-------+------------+-------+------+-------------+-------+--------+
| 57   | 7     | 1330103779 | 0     | RID  | 1:339689:45 | X     | GRANT  |
| 57   | 7     | 1330103779 | 0     | RID  | 1:339688:45 | X     | GRANT  |
| 57   | 7     | 1330103779 | 0     | RID  | 1:339687:61 | X     | GRANT  |
...
+------+-------+------------+-------+------+-------------+-------+--------+

SQL #2 - Current SQL 

Run this SQL from MS SSMS. It shows currently running SQL processes coming from a JDBC connection (that is, the Masking Engine). 

SELECT (SELECT [text] FROM 
   sys.dm_exec_sql_text(sql_handle)) AS SQLCommand,
   spid AS SPID, 
   status AS Status,
   hostname AS Hostname, 
   hostprocess AS HostProcess, 
   open_tran AS OpenTrans, 
   cmd AS Command,
   blocked AS Blocked, 
   CONVERT(VARCHAR(19), waittime) AS WaitTime,
   CASE waittype WHEN 0x0000 THEN SPACE(10) ELSE waitresource END as Waiting, 
   login_time AS LoginTime
FROM 
   sys.sysprocesses WITH (NOLOCK)
WHERE
   program_name = 'Microsoft JDBC Driver for SQL Server'

Example Output

The example below shows a lock between UPDATE and SELECT. The UPDATE lock is older and the SELECT has recently been taken.

  • SQLCommand: The active SQL commands. 
  • SPID: The session ID form the process.
  • Status: Current status. The process below is healthy. SELECT is sleeping since buffers are full and UPDATE is running. 
  • Hostname: The host issued the SQL commands (should be the masking engine and nothing else).
  • HostProcess: Workstation process ID number.
  • OpenTrans: Number of open transactions for the process (should be 1).
  • Command: Internal SQL Server Command currently being executed.
  • Blocked: 0 (or NULL) = not blocked. -2 = blocked by orphaned trans. -3 = blocked by deferred recovered trans. 
  • WaitTime: Current wait time in milliseconds
  • Waiting: Textual representation of a lock resource (no lock in this example).
  • LoginTime: Time this client process was logged in.

For more details see: https://docs.microsoft.com/en-us/sql...l-server-ver15

+------------------------------------------------------------+------+-----------+-------------+-------------+-----------+---------+---------+----------+----------------+-------------------------+
| SQLCommand                                                 | SPID | Status    | Hostname    | HostProcess | OpenTrans | Command | Blocked | WaitTime | Waiting        | LoginTime               |
+------------------------------------------------------------+------+-----------+-------------+-------------+-----------+---------+---------+----------+----------------+-------------------------+
| SELECT "Mask" , "DMS_ROW_ID" FROM "dbo"."YOUR_TABLE"       | 54   | suspended | YOUR_ME.com | 0           | 1         | SELECT  | 0       | 849      |                | 2020-09-28 17:58:56.077 |
| (@P0 varchar(8000),@P1 bigint)UPDATE "dbo"."YOUR_TABLE"... | 57   | suspended | YOUR_ME.com | 0           | 2         | UPDATE  | 67      | 394723   | PAG: 7:1:10371 | 2020-09-28 17:58:56.077 |
+------------------------------------------------------------+------+-----------+-------------+-------------+-----------+---------+---------+----------+----------------+-------------------------+

Tasks - locks

Task L1 - SQL Table Metadata

Run this SQL from MS SSMS. Change the table name 'YOUR_TABLE' to the table you want to investigate. It will show table metadata.

SELECT 
    t.NAME AS TableName,
    p.rows AS RowCounts,
    SUM(a.total_pages) AS TotalPages, 
    SUM(a.used_pages) AS UsedPages, 
    (SUM(a.total_pages) - SUM(a.used_pages)) AS UnusedPages,
p.rows/SUM(a.used_pages) AS RowsPerPage,
SUM(a.used_pages)*10000/p.rows AS PagesPerCommit
FROM 
    sys.tables t
INNER JOIN      
    sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN 
    sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN 
    sys.allocation_units a ON p.partition_id = a.container_id
WHERE 
    i.OBJECT_ID > 255 
AND t.NAME = 'YOUR_TABLE' 
GROUP BY 
    t.Name, p.Rows
ORDER BY 
    t.Name

Example Output

The example below shows.

  • RowCount: Total number of rows in the table.
  • TotalPages: Total number of Pages assigned for the table. 
  • UsedPages: The number of Pages used by the table.
  • UnusedPages: The number of Pages currently not used. 
  • RowsPerPage: The average number of Rows per Page (if this number is small then the database will need to read many pages to get to and update the data which will affect the performance).
  • PagesPerCommit: The average number of pages updated on a standard COMMIT SIZE (10,000). This value is related to 'RowsPerPage'. A larger number here will affect the performance as UPDATE will need to update more pages.  
+------------+----------+------------+-----------+-------------+-------------+----------------+
| TableName  | RowCount | TotalPages | UsedPages | UnusedPages | RowsPerPage | PagesPerCommit |
+------------+----------+------------+-----------+-------------+-------------+----------------+
| YOUR_TABLE | 509223   | 1937       | 1933      | 4           | 263         | 37             |
+------------+----------+------------+-----------+-------------+-------------+----------------+

Task L2 - SQL Indexes

Run this SQL from MS SSMS. Change the table name 'YOUR_TABLE' to the table you want to investigate. 

 

Note

Notes:

  • The 'like' and '%', alternatively, include the schema.
  • Indexes are removed and added as part of the masking job and might not be in the list.

 

select 
    schema_name(o.schema_id) + '.' + o.name as 'Table_View', 
    case when o.type = 'U' then 'Table'
        when o.type = 'V' then 'View'
        end as 'Object type',
    c.name 'Column name',
    t.Name 'Data type',
    c.is_identity as 'Ident',
    c.is_computed as 'Comp',
    i.[name] as 'Index name',
    i.object_id as "Index ObjID",
    case when i.type = 1 then 'Clustered index'
        when i.type = 2 then 'Nonclustered unique index'
        when i.type = 3 then 'XML index'
        when i.type = 4 then 'Spatial index'
        when i.type = 5 then 'Clustered columnstore index'
        when i.type = 6 then 'Nonclustered columnstore index'
        when i.type = 7 then 'Nonclustered hash index'
        end as 'Index type',
    case when i.is_unique = 1 then 'Yes'
        else 'No' end as 'Unique'
from 
    sys.objects o
inner join 
    sys.indexes i on o.object_id = i.object_id
inner join 
    sys.index_columns ic on ic.object_id = o.object_id and ic.index_id = i.index_id
inner join
    sys.columns c on ic.object_id =  c.object_id and ic.column_id = c.column_id
INNER JOIN 
    sys.types t ON c.user_type_id = t.user_type_id
where o.is_ms_shipped <> 1
and i.index_id > 0
and o.name like '%YOUR_TABLE'

Example Output

The example below shows.

  • Object type: The type of object (Table or View - this should be Table). 
  • Column name: The name of the indexed column.
  • Data type: What is the data type of the column?
  • Ident: Is this an Identity column? 
  • Comp: Is this a Computed column?
  • Index name: The name of the index.
  • Index type: Type of index.
  • Unique: Is there a unique constraint on the index?
+----------------+-------------+-------------+-----------+-------+------+--------------+---------------------------+--------+
| Table_View     | Object_type | Column Nnme | Data type | Ident | Comp | Index name   | Index type                | Unique |
+----------------+-------------+-------------+-----------+-------+------+--------------+---------------------------+--------+
| dbo.YOUR_TABLE | Table       | DMS_ROW_ID  | int       | 1     | 0    | i_DMS_ROW_ID | Nonclustered unique index | No     |
+----------------+-------------+-------------+-----------+-------+------+--------------+---------------------------+--------+

Task 3 - Table Describe 

Use Management Studio to 'Script Table as...' and paste the table and index definitions into your support case.

Task 4 - Execution Plan 

Get the Execution Plan for the query and investigate:

  • Type of Scan (Table or Index)
  • Number of Threads.
  • Type of  

Task 5 - Capture MS SQL Profile 

To collect data when the lock is happening - the optimal tool is to capture a SQL profile. Using the template below will capture all relevant lock types, modes, and data. 

Use this Template file to capture a trace of the Masking Job.

Template: Delphix_InvestigateLocks.tdf

 

More SQL Queries: 

SELECT
    OBJECT_NAME(ps.object_id) as TableName,
    i.name as IndexName,
    ps.index_type_desc,
    ps.page_count,
    ps.avg_fragmentation_in_percent,
    ps.forwarded_record_count
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
INNER JOIN sys.indexes AS i
    ON ps.OBJECT_ID = i.OBJECT_ID  
    AND ps.index_id = i.index_id
WHERE OBJECT_NAME(ps.object_id) = 'Demo'

 

Warning

Warning:

These traces can be extremely large.

 

Example - Saved as Table

The example below shows the locks taken on SELECT COUNT (this is in most of the cases the same as the just a SELECT) (the difference is that SELECT COUNT will have an aggregate and return one row). The Index in this example is Clustered. 

KBA SSMS - SQL Profile Example.png

Sample locks

Below are two examples from masking the same data but with two different types of Indexes.

In the first example, the table has an ID column that is a Primary Key that is indexed with a Clustered Index. The second example has no Primary Key and hence the masking engine has created DMS_ROW_ID with a Nonclustered Index.

The data collected is from MS SQL Profile Traces from each execution.

Note

Note:

The Nonclustered examples below is simplified slightly. Portions of the larger transaction have not been included.

 

Clustered Index

Locks and SELECT

Salient observations in the SELECT:

  • A Page is only acquired once, and all rows in the page are read.
  • The Page locks are overlapping (likely to maximize read performance).
  • When the buffers are filled - the fetch pauses and then resumes and fetches another group of rows.
+----------+------------+--------+---------------+
| TextData | Type       | Mode   | EventClass    |
+----------+------------+--------+---------------+
| SELECT "Mask", "ID" FROM "dbo"."YOUR_TABLE"    |
|          | 5 - OBJECT | 6 - IS | Lock:Acquired |
| 1:2456   | 6 - PAGE   | 3 - S  | Lock:Acquired |
| 1:2458   | 6 - PAGE   | 3 - S  | Lock:Acquired |
| 1:2456   | 6 - PAGE   | 3 - S  | Lock:Released |
| 1:2459   | 6 - PAGE   | 3 - S  | Lock:Acquired |
| 1:2458   | 6 - PAGE   | 3 - S  | Lock:Released |
| 1:2460   | 6 - PAGE   | 3 - S  | Lock:Acquired |
| 1:2459   | 6 - PAGE   | 3 - S  | Lock:Released |
| 1:2461   | 6 - PAGE   | 3 - S  | Lock:Acquired |
| 1:2460   | 6 - PAGE   | 3 - S  | Lock:Released |
...

Locks and UPDATE

Salient observations in the UPDATE:

  • Since the data is fetched in order the update is also performed in order.
  • The same Page is accessed. This continues until all records are updated and move on to the next page.
  • The locks here are on Object, Page, and Key (the index). 
  • All Key and Page locks are released at the end of the transaction when the data is committed. 
  • There is a very small chance of lock.
+-------------------------+------------+----------+---------------+
| TextData                | Type       | Mode     | EventClass    |
+-------------------------+------------+----------+---------------+
| UPDATE dbo.UR_TBL SET…  |            |          |               |
| NULL                    | 5 - OBJECT | 8 - IX   | Lock:Acquired |
| 1:2456                  | 6 - PAGE   | 8 - IX   | Lock:Acquired |
| (8194443284a0)          | 7 - KEY    | 5 - X    | Lock:Acquired |
| (8194443284a0)          | 7 - KEY    | 0 - NULL | Lock:Released |
| 1:2456                  | 6 - PAGE   | 0 - NULL | Lock:Released |
+-------------------------+------------+----------+---------------+
| UPDATE dbo.UR_TBL SET…  |            |          |               |
| 1:2456                  | 6 - PAGE   | 8 - IX   | Lock:Acquired |
| (61a06abd401c)          | 7 - KEY    | 5 - X    | Lock:Acquired |
| (61a06abd401c)          | 7 - KEY    | 0 - NULL | Lock:Released |
| 1:2456                  | 6 - PAGE   | 0 - NULL | Lock:Released |
+-------------------------+------------+----------+---------------+
...
+-------------------------+------------+----------+---------------+
...
| n:nnnnnn:nn             | 7 - KEY    | 5 - X    | Lock:Released |
| n:nnnnnn:nn             | 7 - KEY    | 5 - X    | Lock:Released |
| n:nnnnnn                | 6 - PAGE   | 8 - IX   | Lock:Released |
|                         | 5 - OBJECT | 8 - IX   | Lock:Released |
+-------------------------+------------+----------+---------------+

Page Access and Masking operation

The example below shows a simplified view of how pages are accessed.

  • Step 0 - The masking job has started and the engine is starting to fetch 20,000 rows (default). MS SQL will only lock two pages (darker green) when reading (overlapping) and then released (light green).
  • Step 1 - The fetched records are starting to buffer (grey) in the Masking Engine (to optimise performance). The UPDATE is starting as soon as records are masked. This will be done in the order the rows were read (hence per page). 
  • Step 2 - The buffers are full and the fetch of records in the SELECT are paused. More records and pages are updated (and locked). 
  • Step 3 - When buffers come under a certain limit, more records are fetched. 
  • Step 4 - When the Commit Size is reached - the updated rows (pages) are committed. Another fetch has also happened here.
  • This process keeps going untill the end. 

KBA6319 - Clustered Index Pages.png

Since the Clustered Index reads the data per page it's easier to draw this example. 

Page Access and Masking operation

The example below shows a simplified view of how pages are accessed when there is a WHERE clause.

  • Step 0 - The masking job has started and the engine is starting to fetch 20,000 rows (default). Since it is not known which record matches the WHERE clause, all records are read but there are on 50% match so twice the number of pages needs to be read to reach 20,000 rows. MS SQL will only lock two pages (darker green) when reading (overlapping) and then released (light green).
  • Step 1 - The fetched records are starting to buffer (grey) in the Masking Engine (to optimise performance). The UPDATE is starting as soon as records are masked. This will be done in the order the rows were read (hence per page). 
    • There is an example here where a page did not have any matching (WHERE) records and hence - records in this page are not loaded into the Masking Engine. 
  • Step 2 - The buffers are full and the fetch of records in the SELECT are paused. The double amount of pages are read in order to fill the bigger. Also, more records and pages are updated (and locked). 
  • Step 3 - When buffers come under a certain limit, more records are fetched. 
  • Step 4 - When the Commit Size is reached - the updated rows (pages) are committed. Another fetch has also happened here. Note that on average the double amount of pages accessed. 
  • This process keeps going until the end. 

KBA6319 - Clustered Index w WHERE Pages.png

Nonclustered Index

Salient observations in the SELECT:

  • The fetched records are scattered from different pages (the fetch order is from the nonclustered index).
  • The same page is read many times. One page is even read again immediately following the first read.
  • The first page acquired is released much further down (not even in the list). 
+------------------+------------+--------+---------------+
| TextData         | Type       | Mode   | EventClass    |
+--------------------------------------------------------+
| SELECT "Mask", "DMS_ROW_ID" FROM "dbo"."YOUR_TABLE"    |
|                  | 5 - OBJECT | 6 - IS | Lock:Acquired |
| 1:338640         | 6 - PAGE   | 3 - S  | Lock:Acquired |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Acquired |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Released |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Acquired |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Released |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Acquired |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Released |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Acquired |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Released |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Acquired |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Released |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Acquired |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Released |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Acquired |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Released |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Acquired |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Released |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Acquired |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Released |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Acquired |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Released |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Acquired |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Released |
...

Salient observations in the UPDATE:

  • The data updated in the same order as they are fetched.
  • This means the update is also scattered. 
  • The locks here are on Object, Page, RID (Row ID), and Key (the index). 
  • All RID and Page locks are released at the end of the transaction when the data is committed. 
  • There is an increased risk lock.
+-------------------------+------------+----------+---------------+
| TextData                | Type       | Mode     | EventClass    |
+-------------------------+------------+----------+---------------+
| UPDATE dbo.UR_TBL SET…  |            |          |               |
|                         | 5 - OBJECT | 8 - IX   | Lock:Acquired |
| 1:355750                | 6 - PAGE   | 7 - IU   | Lock:Acquired |
| (7fbb0d81d998)          | 7 - KEY    | 4 - U    | Lock:Acquired |
| 1:338956                | 6 - PAGE   | 7 - IU   | Lock:Acquired |
| 1:338956:89             | 9 - RID    | 4 - U    | Lock:Acquired |
| 1:338956                | 6 - PAGE   | 8 - IX   | Lock:Acquired |
| 1:338956:89             | 9 - RID    | 5 - X    | Lock:Acquired |
| 1:338956:89             | 9 - RID    | 0 - NULL | Lock:Released |
| 1:338956                | 6 - PAGE   | 0 - NULL | Lock:Released |
| (7fbb0d81d998)          | 7 - KEY    | 4 - U    | Lock:Released |
| 1:355750                | 6 - PAGE   | 7 - IU   | Lock:Released |
+-------------------------+------------+----------+---------------+
| UPDATE dbo.UR_TBL SET…  |            |          |               |
| 1:354672                | 6 - PAGE   | 7 - IU   | Lock:Acquired |
| (179176a0f183)          | 7 - KEY    | 4 - U    | Lock:Acquired |
| 1:338640                | 6 - PAGE   | 7 - IU   | Lock:Acquired |
| 1:338640:1              | 9 - RID    | 4 - U    | Lock:Acquired |
| 1:338640:1              | 9 - RID    | 5 - X    | Lock:Acquired |
| 1:338640:1              | 9 - RID.   | 0 - NULL | Lock:Released |
| 1:338640                | 6 - PAGE.  | 0 - NULL | Lock:Released |
| (179176a0f183)          | 7 - KEY    | 4 - U    | Lock:Released |
| 1:354672                | 6 - PAGE   | 7 - IU   | Lock:Released |
+-------------------------+------------+----------+---------------+
...
+-------------------------+------------+----------+---------------+
...
| n:nnnnnn:nn             | 9 - RID    | 5 - X    | Lock:Released |
| n:nnnnnn:nn             | 9 - RID    | 5 - X    | Lock:Released |
| n:nnnnnn                | 6 - PAGE   | 8 - IX   | Lock:Released |
|                         | 5 - OBJECT | 8 - IX   | Lock:Released |
+-------------------------+------------+----------+---------------+

Page Access and Masking operation

The example below shows a simplified view of how pages are accessed when the data is read out of order and there is a lock.

Note: A NonClustered index might read the data out of order. NonClustered Indexes can read data in different orders. This example is an illustration of out of order read. Some reads will acquire different types of locks. 

 

  • Step 0 - The masking job has started and the engine is starting to fetch 20,000 rows (default). Since the data is read out of order -multiple pages need to be accessed. Also, in this example, there is only one Page lock acquired at any time (darker green) and then released (light green). The same page can be accessed multiple times. 
  • Step 1 - The fetched records are starting to buffer (grey) in the Masking Engine. The UPDATE is starting as soon as records are masked. To show this, a separate row has been drawn.
    • The Update is done in the order they are read and hence the updated pages are also scattered.
  • Step 2 - The buffers are full and the fetch of records in the SELECT are paused. More records and pages are updated (and locked). 
  • Step 3 - When buffers come under a certain limit, more records are fetched. 
    • This is now when we are at risk to have locks. There are a number of Page locks due to UPDATE and they are scattered. The masking engine needs to fetch more records and these are also scattered. 
    • If a page needs to be accessed (x) to fetch data that already has a lock due to the UPDATE - the process locks.

KBA6319 - Out Of Order and Lock Pages.png

Nonclustered Index and WHERE

When there is a WHERE clause in the SELECT statement. The same number of pages is read since the page needs to be fetched to read the record and apply the WHERE clause (this might change if an index is used on the filtered column). 

The number of records fetched into the masking engine for masking is not less (due to the filter) but when updating each transaction will still contain the number of records defined in Commit Size, hence an even higher risk to have a lock (or deadlock). 

This can be seen in the Clustered Index example above. 

 

Good query 

SELECT
    OBJECT_NAME(ps.object_id) as TableName,
    i.name as IndexName,
    ps.index_type_desc,
    ps.page_count,
    ps.avg_fragmentation_in_percent,
    ps.forwarded_record_count
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
INNER JOIN sys.indexes AS i
    ON ps.OBJECT_ID = i.OBJECT_ID  
    AND ps.index_id = i.index_id
WHERE OBJECT_NAME(ps.object_id) = 'Demo'