Skip to main content
Delphix

MS SQL Server Masking and Blocks (Deadlocks) (KBA6319)

 

KBA

KBA# 6319

 

At a Glance   

Description: This article details possible root causes and possible solutions to Blocks and Deadlocks on MS SQL Server. 
Root cause: On MS SQL Server, each action on the database is protected against 'dirty' data by setting locks on objects. This is a database operation and a hang (or deadlock) happens when a lock is taken on the UPDATE  that is not compatible with the lock taken on the SELECT.

When a lock happens, the data from the SELECT is fetched in a random order causing a scattered read of the data pages for the table. 
Best practice: There are some best practices for an MS SQL In-Place masking job. These points will reduce the chance of encountering a hang (or deadlock). 


The SELECT statement fetching the data will define the order in which data is read. The best option is when all rows in a page are fetched and then moved on to the next page (accessing a page only once). A Clustered Index Scan will do this and generally also a Table Scans  (though we have seen scattered reads on Table Scans). Therefore, consider the following: 
  1. If possible use a Clustered Index as the Unique Row Identifier (Key). If existing the Masking Engine will automatically use this as the key. 
  2. Limit the use of Custom SQL which could change the order the data is read. This includes: 
    • Avoid UNION.
    • Avoid ORDER BY.
  3. Ensure that there is no external transaction running which will change the data on the masked table. This includes Triggers (which should be disabled in this case). 
How to identify a hang: If a masking job hangs, use the following MS SQL command (and look for UPDATE and SELECT from the masked table):

        sp_lock
        Current SQL



See below for more details and additional queries.
How to identify a deadlock: MS SQL Server has a deadlock reporting tool. Go to "Management" > "Extended Events" > "Sessions" > "system_health" > "package0.event_file" and click "View Target Data…".

See below for more details and additional queries.
Solutions: Below are three possible solutions listed: 
  1. Use where possible a Clustered Index as the URI. Primary Keys are normally Clustered Indexes
  2. Try to avoid using 'WHERE' and 'JOIN' clauses on Heap Tables and when the PK is a NonClustered Index. 
  3. Do not use ORDER BY.
  4. Limit the number of UPDATE page locks by reducing the COMMIT SIZE (try 1,000 and then 100).

If a 'WHERE' clause is needed, then try removing any index on the columns in the WHERE. 

Applicable Delphix Versions

Click here to view the versions of the Delphix engine to which this article applies
Major Release All Sub Releases
6.0 6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0

5.3

5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0

5.2

5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1

5.1

5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0

5.0

5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1 ,5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4

4.3

4.3.1.0, 4.3.2.0, 4.3.2.1, 4.3.3.0, 4.3.4.0, 4.3.4.1, 4.3.5.0

4.2

4.2.0.0, 4.2.0.3, 4.2.1.0, 4.2.1.1, 4.2.2.0, 4.2.2.1, 4.2.3.0, 4.2.4.0 , 4.2.5.0, 4.2.5.1

4.1

4.1.0.0, 4.1.2.0, 4.1.3.0, 4.1.3.1, 4.1.3.2, 4.1.4.0, 4.1.5.0, 4.1.6.0

Locks when masking an MS SQL Server Table

One of the harder issues to resolve is when an MS SQL Server Table creates a hung masking job due to a lock. If MS SQL Server resolves this, by killing one of the processes, the lock is called a deadlock.

In masking, the error happens since the lock on the UPDATE  is not compatible with the lock taken on the SELECT.

On normal masking executions, this should not cause an issue. One possible reason for this issue is if the rows are not read in the order they are stored in the data pages. 

This KBA looks into:

  • Types of locks.
  • Why they happen.
  • How to troubleshoot.
  • How to resolve.

Animation of Masking Job and MS SQL Locks

The animation below shows a normal masking job and how the locks are successfully taken and released. 

The job is masking a column called 'Mask' using a Unique Row Identifier (URI) called 'ID'. The ID column is a Primary Key and is Indexed using a Clustered Index

The two page-locks below (X) and (S) are not compatible. This is acceptable due to the following:

  • The locks are not taken on the same page at the same time.
    • The SELECT locks are very brief and on two pages at the same time (due to overlap read).
    • The UPDATE does lock pages for the duration to update 10,000 (Commit Size) records. 
  • Few (as few as possible) numbers of locks are taken - locks seen are (KEY, PAG, TAB). 

Note: locks happens when:

  • The record order of the fetch is scattered (i.e. due to the order in the index) resulting in scattered page locks on SELECT. 
  • The updated pages are then also scattered. 
  • A larger number of locks are taken - examples of locks types are RID, KEY, PAGE, and TAB.
  • The number of pages locked increases when WHERE is used. 
  • This can be seen in the example further down on this page where the index is a Nonclustered Index. 

 

MS SQL Moving v5 (fast input w gauge).gif

Locks and Deadlocks

There are two possible scenarios: locks and deadlock. 

Masking job is hung due to lock

The most common lock type here is; 

LCK_M_IX: this wait type occurs when a thread is waiting for a table or page IX lock so that a row insert or update can occur. 

The lock is usually on a PAGE (from MS SQL). 

lastwaittype  | LCK_M_IX                        
waitresource  | PAG: 9:21:47483425   

The Masking Logs will not show any error here. The masked row count will though stop at '(COMMIT_SIZE x n) - 1'. For example 9,999. 

The Masking Logs will just stop (the job will here show 14,9999 rows masked) (only Table input and Update shown).  

[JOB_ID_XX_YY] 2020/07/11 20:11:44 - Table input.0 - linenr 50000
[JOB_ID_XX_YY] 2020/07/11 20:11:46 - Table input.0 - linenr 100000
[JOB_ID_XX_YY] 2020/07/11 20:11:47 - Table input.0 - linenr 150000
[JOB_ID_XX_YY] 2020/07/11 20:11:51 - Update.0 - linenr 50000
[JOB_ID_XX_YY] 2020/07/11 20:11:51 - Table input.0 - linenr 200000
[JOB_ID_XX_YY] 2020/07/11 20:11:55 - Table input.0 - linenr 250000
[JOB_ID_XX_YY] 2020/07/11 20:11:56 - Update.0 - linenr 100000
[JOB_ID_XX_YY] 2020/07/11 20:12:00 - Table input.0 - linenr 300000

Masking job deadlocked

A deadlock is when MS SQL Server's deadlock monitor thread has identified an unresolvable block and terminates one of the processes. This will resolve the lock but from a masking job's point of view, the job has terminated.

The error seen in the Masking Logs is something like this: 

Transaction (Process ID nnn) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

How to resolve

To solve these issues, the main things are to ensure that the order is optimized and to limit the number of locked pages. 

The recommendations are: 

  • Use where possible a Clustered Index as the URI (Primary Keys are normally Clustered Indexes).  
    • If you cannot use a clustered index, you can perform a one-time simulation of the clustered index by performing a table rebuild.  (ALTER TABLE <table_name> REBUILD;)  A rebuild will perform a one-time (on disk) re-write of the existing rows in sorted order.  A clustered index does the same, but maintains the row order as changes occur to the table.  A rebuild right before you kick-off a masking job will allow that job to succeed if the problem is blocking.
  • Try to avoid the use of 'WHERE' clauses. 
  • Do not use ORDER BY.
  • Limit the number of UPDATE page locks by reducing the COMMIT SIZE (try 1,000 and then 100).
    • This will cause the masking job to be slower.

Why do lock issues happen?

We do not know all reasons for why lock issues happen and the exact mechanics. What we know is that for a lock to happen, the data needs to be fetched from pages in random order. Also, a good understanding of how MS SQL Server is working and its internal structures will help to sort out what is happening. 

Here are the reasons why the lock can happen. 

Type of INDEX and Heap Table

The main cause is the INDEX on the Unique Row Indenter (URI) used. 

  • If the URI is indexed using a Nonclustered Index - this could cause the records and pages read to be scattered when accessed (and updated)
  • If the table is a Heap table and there is no PK the masking engine will create DMS_ROW_ID and to reduce the time it takes and to reduce the number of changes on the database this index is a Nonclustered Index. 

WHERE clause in the Rule Set SQL

It should be ok to use a WHERE Clause if the URI is a Clustered Index.

If the URI is a Nonclustered Index, the WHERE will increase the number of pages locked in the UPDATE and SELECT will access all records in the table. This will increase the risk of locks.

Furthermore, WHERE classes could change the execution plan, especially if there is an index on the columns used in the WHERE clause.

Parallel processing on SELECT

If SQL Server has access to multiple threads the database might use multiple threads to fetch the data. This means that different pages will be accessed and the fetch can be out of order which in the end can result in a lock with the UPDATE.

ORDER BY in the Rule Set SQL

This is not common since a masking job should not have an ORDER BY statement in the SQL. 

The reason here is that the ORDER BY (as it implies) will dictate the order in which the data is read. This will result in multiple blocks to have Shared Lock. 

Row count executed while table being masked

Before the masking job starts, the masking engine will execute a SELECT COUNT query on all tables in the masked Rule Set. Most of the time, the result of this is returned before the job starts.

It has been noticed that the SELECT COUNT has been taken a long time and that the query is running while the masking job is masking the same table. This will add to the number of Page locks and increase the risk of locks. 

Lock Escalation

Lock escalation is MS SQL Server's process of converting many fine-grained locks (such as page locks) into table locks. MS SQL Server will dynamically determine when to perform a lock escalation.

Note: It is not known if MS SQL Server will ever cause an escalation when masking data. This should not happen as there are separate connections between UPDATE and SELECT. 

MS SQL database feature that could affect locks

There are some technologies that could be adding to the root cause. These are: 

  • Parallel processing (mentioned above).
  • Multiple concurrent executions and "merry-go-round scanning".
  • B-Tree structure and IAM pages. 
  • Filtering and "predicate pushdown".

Troubleshooting

In order to understand why a lock or deadlock happened, we need to collect data from the database. The best time to do this is when the lock is happening.

Use the SQL commands below to collect data: 

Tasks - specific to Deadlock

When a deadlock happens there are no active sessions left on the database.

To find deadlocks, use the MS SQL Serverdeadlock reporting tool.

  • Go to Management > Extended Events > Sessions > system_health > package0.event_file 
  • Click View Target Data….
  • Note: the recent items are at the end of the list. 

Example - Server Deadlock Reporting Tool

xml_report

<deadlock>
    <victim-list>
        <victimProcess id="process1c190257468"/>
    </victim-list>
    <process-list>
        <process id="process1c190257468" taskpriority="0" logused="0" waitresource="PAGE: 7:1:13463 " waittime="15389" ownerId="324962" transactionname="implicit_transaction" lasttranstarted="2020-10-20T21:04:35" XDES="0x1c19ebe8428" lockMode="S" schedulerid="1" kpid="7724" status="suspended" spid="57" sbid="0" ecid="0" priority="0" trancount="1" lastbatchstarted="2020-10-20T21:04:34.963" lastbatchcompleted="2020-10-20T21:04:34.957" lastattention="1900-01-01T00:00:00.957" clientapp="Microsoft JDBC Driver for SQL Server" hostname="YourME" hostpid="0" loginname="user" isolationlevel="read committed (2)" xactid="324962" currentdb="7" currentdbname="Test" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128058">
            <executionStack>
                <frame procname="adhoc" line="1" stmtend="86" sqlhandle="0x0200000048e01434b96494e5263ed6bc0a64fdc729f06e2d0000000000000000000000000000000000000000">  unknown    
       </frame>
            </executionStack>
            <inputbuf>  SELECT &quot;Mask&quot; , &quot;ID&quot; FROM &quot;dbo&quot;.&quot;Heap_2_Col&quot;   
     </inputbuf>
        </process>
        <process id="process1c1909b6108" taskpriority="0" logused="2594536" waitresource="PAGE: 7:1:10106 " waittime="3511" ownerId="326162" transactionname="implicit_transaction" lasttranstarted="2020-10-20T21:14:31.797" XDES="0x1c195df0428" lockMode="IX" schedulerid="1" kpid="5932" status="suspended" spid="60" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2020-10-20T21:14:31.767" lastbatchcompleted="2020-10-20T21:14:31.677" lastattention="1900-01-01T00:00:00.677" clientapp="Microsoft JDBC Driver for SQL Server" hostname="YourME" hostpid="0" loginname="user" isolationlevel="read committed (2)" xactid="326162" currentdb="7" currentdbname="Test" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128058">
            <executionStack>
                <frame procname="adhoc" line="1" stmtstart="60" stmtend="192" sqlhandle="0x020000009fb93f2080f3c22eb583aea908559018e477dd910000000000000000000000000000000000000000">  unknown    
     </frame>
            </executionStack>
            <inputbuf>  (@P0 varchar(8000),@P1 bigint)UPDATE &quot;dbo&quot;.&quot;Heap_2_Col&quot; SET &quot;Mask&quot; = @P0 WHERE  ( ( ID = @P1  ) )    </inputbuf>
        </process>
    </process-list>
    <resource-list>
        <pagelock fileid="1" pageid="13463" dbid="7" subresource="FULL" objectname="Test.dbo.Heap_2_Col" id="lock1c17700a900" mode="IX" associatedObjectId="72057594044350464">
            <owner-list>
                <owner id="process1c1909b6108" mode="IX"/>
            </owner-list>
            <waiter-list>
                <waiter id="process1c190257468" mode="S" requestType="wait"/>
            </waiter-list>
        </pagelock>
        <pagelock fileid="1" pageid="10106" dbid="7" subresource="FULL" objectname="Test.dbo.Heap_2_Col" id="lock1c177269580" mode="SIU" associatedObjectId="72057594044350464">
            <owner-list>
                <owner id="process1c190257468" mode="S"/>
            </owner-list>
            <waiter-list>
                <waiter id="process1c1909b6108" mode="IX" requestType="convert"/>
            </waiter-list>
        </pagelock>
    </resource-list>
</deadlock>

Example - Trace from the deadlock

The following example was captured from a SQL Trace. A trace needs to be captured while the deadlock happens. Please see below how to capture a trace.

Look for the following EventClasses:

  • 25 (Lock: Deadlock event) - Deadlock event.
  • 26 (Lock: Cancel event) - Lock acquisition has been canceled (for example, due to deadlock). 
  • 27 (Lock: Timeout) - Lock timed out due to blocking lock on a required resource. 
  • 59 (Lock: Deadlock Chain) - Chained events leading up to the deadlock.
  • 148 (Deadlock Graph) - XML description of the deadlock.

 

Note: The XML used to create this graph is the same as the XML above. There might be a tool to create these outside the trace tool. 

KBA6319 - Deadlock Trace.png

Tasks - specific to Hangs

SQL 1 - Current locks

Run this MS Store Procedure from the MS SSMS. The command will show the current locks: 

sp_lock

Example Output

The example below is from a masking job with DMS_ROW_ID and shows: 

  • spid: The session ID form the process.
  • dbid: The database ID. 
  • ObjId: The database object ID.
  • IndId: The index ID. 
  • Type: The lock type.
  • Resource: The resource that is locked. 
  • Mode: The lock mode. 
  • Status: The lock status, CNVRT = Being converted, GRANT = granted, WAIT = is waiting for another process. 

For details see: https://docs.microsoft.com/en-us/sql...l-server-ver15 

+------+-------+------------+-------+------+-------------+-------+--------+
| spid | dbid  | ObjId      | IndId | Type | Resource    | Mode  | Status |
+------+-------+------------+-------+------+-------------+-------+--------+
| 54   | 7     | 0          | 0     | DB   |             | S     | GRANT  |
| 54   | 32767 | -571204656 | 0     | TAB  |             | Sch-S | GRANT  |
| 54   | 1     | 1467152272 | 0     | TAB  |             | IS    | GRANT  |
| 54   | 7     | 0          | 0     | DB   |             | S     | GRANT  |
| 54   | 7     | 0          | 0     | DB   |             | S     | GRANT  |
| 54   | 7     | 0          | 0     | DB   |             | S     | GRANT  |
+------+-------+------------+-------+------+-------------+-------+--------+
| 57   | 7     | 1330103779 | 0     | RID  | 1:339689:45 | X     | GRANT  |
| 57   | 7     | 1330103779 | 0     | RID  | 1:339688:45 | X     | GRANT  |
| 57   | 7     | 1330103779 | 0     | RID  | 1:339687:61 | X     | GRANT  |
...
+------+-------+------------+-------+------+-------------+-------+--------+

Filtering the result, it is possible to find where two spids are locking on the same page:

+---------+----+----+----+----+----+----+
| Page    | 51 | 54 | 55 | 56 | 57 | 58 |
+---------+----+----+----+----+----+----+
| 1:10371 |    | 1  |    |    | 3  |    |
+---------+----+----+----+----+----+----+

SQL 2 - Current SQL 

Run this SQL from MS SSMS. It shows currently running SQL processes coming from a JDBC connection (that is, the Masking Engine). 

SELECT (SELECT [text] FROM 
   sys.dm_exec_sql_text(sql_handle)) AS SQLCommand,
   spid AS SPID, 
   status AS Status,
   hostname AS Hostname, 
   hostprocess AS HostProcess, 
   open_tran AS OpenTrans, 
   cmd AS Command,
   blocked AS Blocked, 
   CONVERT(VARCHAR(19), waittime) AS WaitTime,
   CASE waittype WHEN 0x0000 THEN SPACE(10) ELSE waitresource END as Waiting, 
   login_time AS LoginTime
FROM 
   sys.sysprocesses WITH (NOLOCK)
WHERE
   program_name = 'Microsoft JDBC Driver for SQL Server'

Example Output

The example below shows a lock between UPDATE and SELECT. The UPDATE lock is older and the SELECT has recently been taken.

  • SQLCommand: The active SQL commands. 
  • SPID: The session ID form the process.
  • Status: Current status. The process below is healthy. SELECT is sleeping since buffers are full and UPDATE is running. 
  • Hostname: The host issued the SQL commands (should be the masking engine and nothing else).
  • HostProcess: Workstation process ID number.
  • OpenTrans: Number of open transactions for the process (should be 1).
  • Command: Internal SQL Server Command currently being executed.
  • Blocked: 0 (or NULL) = not blocked. -2 = blocked by orphaned trans. -3 = blocked by deferred recovered trans. 
  • WaitTime: Current wait time in milliseconds
  • Waiting: Textual representation of a lock resource (no lock in this example).
  • LoginTime: Time this client process was logged in.

For more details see: https://docs.microsoft.com/en-us/sql...l-server-ver15

+------------------------------------------------------------+------+-----------+-------------+-------------+-----------+---------+---------+----------+----------------+-------------------------+
| SQLCommand                                                 | SPID | Status    | Hostname    | HostProcess | OpenTrans | Command | Blocked | WaitTime | Waiting        | LoginTime               |
+------------------------------------------------------------+------+-----------+-------------+-------------+-----------+---------+---------+----------+----------------+-------------------------+
| SELECT "Mask" , "DMS_ROW_ID" FROM "dbo"."YOUR_TABLE"       | 54   | suspended | YOUR_ME.com | 0           | 1         | SELECT  | 0       | 849      |                | 2020-09-28 17:58:56.077 |
| (@P0 varchar(8000),@P1 bigint)UPDATE "dbo"."YOUR_TABLE"... | 57   | suspended | YOUR_ME.com | 0           | 2         | UPDATE  | 67      | 394723   | PAG: 7:1:10371 | 2020-09-28 17:58:56.077 |
+------------------------------------------------------------+------+-----------+-------------+-------------+-----------+---------+---------+----------+----------------+-------------------------+

Tasks - all locks

SQL 1 - Table Metadata

Run this SQL from MS SSMS. Change the table name 'YOUR_TABLE' to the table you want to investigate. It will show table metadata.

SELECT 
    t.NAME AS TableName,
    p.rows AS RowCounts,
    SUM(a.total_pages) AS TotalPages, 
    SUM(a.used_pages) AS UsedPages, 
    (SUM(a.total_pages) - SUM(a.used_pages)) AS UnusedPages,
p.rows/SUM(a.used_pages) AS RowsPerPage,
SUM(a.used_pages)*10000/p.rows AS PagesPerCommit
FROM 
    sys.tables t
INNER JOIN      
    sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN 
    sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN 
    sys.allocation_units a ON p.partition_id = a.container_id
WHERE 
    i.OBJECT_ID > 255 
AND t.NAME = 'YOUR_TABLE' 
GROUP BY 
    t.Name, p.Rows
ORDER BY 
    t.Name

Example Output

The example below shows.

  • RowCount: Total number of rows in the table.
  • TotalPages: Total number of Pages assigned for the table. 
  • UsedPages: The number of Pages used by the table.
  • UnusedPages: The number of Pages currently not used. 
  • RowsPerPage: The average number of Rows per Page (if this number is small then the database will need to read many pages to get to and update the data which will affect the performance).
  • PagesPerCommit: The average number of pages updated on a standard COMMIT SIZE (10,000). This value is related to 'RowsPerPage'. A larger number here will affect the performance as UPDATE will need to update more pages.  
+------------+----------+------------+-----------+-------------+-------------+----------------+
| TableName  | RowCount | TotalPages | UsedPages | UnusedPages | RowsPerPage | PagesPerCommit |
+------------+----------+------------+-----------+-------------+-------------+----------------+
| YOUR_TABLE | 509223   | 1937       | 1933      | 4           | 263         | 37             |
+------------+----------+------------+-----------+-------------+-------------+----------------+

SQL 2 - Indexes

Run this SQL from MS SSMS. Change the table name 'YOUR_TABLE' to the table you want to investigate. 

 

Note

Notes:

  • The 'like' and '%', alternatively, include the schema.
  • Indexes are removed and added as part of the masking job and might not be in the list.

 

select 
    schema_name(o.schema_id) + '.' + o.name as 'Table_View', 
    case when o.type = 'U' then 'Table'
        when o.type = 'V' then 'View'
        end as 'Object type',
    c.name 'Column name',
    t.Name 'Data type',
    c.is_identity as 'Ident',
    c.is_computed as 'Comp',
    i.[name] as 'Index name',
    i.object_id as "Index ObjID",
    case when i.type = 1 then 'Clustered index'
        when i.type = 2 then 'Nonclustered unique index'
        when i.type = 3 then 'XML index'
        when i.type = 4 then 'Spatial index'
        when i.type = 5 then 'Clustered columnstore index'
        when i.type = 6 then 'Nonclustered columnstore index'
        when i.type = 7 then 'Nonclustered hash index'
        end as 'Index type',
    case when i.is_unique = 1 then 'Yes'
        else 'No' end as 'Unique'
from 
    sys.objects o
inner join 
    sys.indexes i on o.object_id = i.object_id
inner join 
    sys.index_columns ic on ic.object_id = o.object_id and ic.index_id = i.index_id
inner join
    sys.columns c on ic.object_id =  c.object_id and ic.column_id = c.column_id
INNER JOIN 
    sys.types t ON c.user_type_id = t.user_type_id
where o.is_ms_shipped <> 1
and i.index_id > 0
and o.name like '%YOUR_TABLE'

Example Output

The example below shows.

  • Object type: The type of object (Table or View - this should be Table). 
  • Column name: The name of the indexed column.
  • Data type: What is the data type of the column?
  • Ident: Is this an Identity column? 
  • Comp: Is this a Computed column?
  • Index name: The name of the index.
  • Index type: Type of index.
  • Unique: Is there a unique constraint on the index?
+----------------+-------------+-------------+-----------+-------+------+--------------+---------------------------+--------+
| Table_View     | Object_type | Column Nnme | Data type | Ident | Comp | Index name   | Index type                | Unique |
+----------------+-------------+-------------+-----------+-------+------+--------------+---------------------------+--------+
| dbo.YOUR_TABLE | Table       | DMS_ROW_ID  | int       | 1     | 0    | i_DMS_ROW_ID | Nonclustered unique index | No     |
+----------------+-------------+-------------+-----------+-------+------+--------------+---------------------------+--------+

Task 3 - Table Describe 

Use Management Studio to 'Script Table as...' and paste the table and index definitions into your support case.

Task 4 - Execution Plan 

Get the Execution Plan for the query and investigate:

  • Type of Scan (Table or Index)
  • Number of Threads.
  • Type of  

Task 5 - Capture MS SQL Profile 

To collect data when the lock is happening - the optimal tool is to capture a SQL profile. Using the template below will capture all relevant lock types, modes, and data. 

Use this Template file to capture a trace of the Masking Job.


Delphix_InvestigateLocks.tdf

 

More SQL Queries: 

SELECT
    OBJECT_NAME(ps.object_id) as TableName,
    i.name as IndexName,
    ps.index_type_desc,
    ps.page_count,
    ps.avg_fragmentation_in_percent,
    ps.forwarded_record_count
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
INNER JOIN sys.indexes AS i
    ON ps.OBJECT_ID = i.OBJECT_ID  
    AND ps.index_id = i.index_id
WHERE OBJECT_NAME(ps.object_id) = 'Demo'

 

Warning

Warning:

These traces can be extremely large.

 

Example - Saved as Table

The example below shows the locks taken on SELECT COUNT (this is in most of the cases the same as the just a SELECT) (the difference is that SELECT COUNT will have an aggregate and return one row). The Index in this example is Clustered. 

KBA SSMS - SQL Profile Example.png

Sample locks

Below are two examples from masking the same data but with two different types of Indexes.

In the first example, the table has an ID column that is a Primary Key that is indexed with a Clustered Index. The second example has no Primary Key and hence the masking engine has created DMS_ROW_ID with a Nonclustered Index.

The data collected is from MS SQL Profile Traces from each execution.

Note

Note:

The Nonclustered examples below is simplified slightly. Portions of the larger transaction have not been included.

 

Clustered Index

Locks and SELECT

Salient observations in the SELECT:

  • A Page is only acquired once, and all rows in the page are read.
  • The Page locks are overlapping (likely to maximize read performance).
  • When the buffers are filled - the fetch pauses and then resumes and fetches another group of rows.
+----------+------------+--------+---------------+
| TextData | Type       | Mode   | EventClass    |
+----------+------------+--------+---------------+
| SELECT "Mask", "ID" FROM "dbo"."YOUR_TABLE"    |
|          | 5 - OBJECT | 6 - IS | Lock:Acquired |
| 1:2456   | 6 - PAGE   | 3 - S  | Lock:Acquired |
| 1:2458   | 6 - PAGE   | 3 - S  | Lock:Acquired |
| 1:2456   | 6 - PAGE   | 3 - S  | Lock:Released |
| 1:2459   | 6 - PAGE   | 3 - S  | Lock:Acquired |
| 1:2458   | 6 - PAGE   | 3 - S  | Lock:Released |
| 1:2460   | 6 - PAGE   | 3 - S  | Lock:Acquired |
| 1:2459   | 6 - PAGE   | 3 - S  | Lock:Released |
| 1:2461   | 6 - PAGE   | 3 - S  | Lock:Acquired |
| 1:2460   | 6 - PAGE   | 3 - S  | Lock:Released |
...

Locks and UPDATE

Salient observations in the UPDATE:

  • Since the data is fetched in order the update is also performed in order.
  • The same Page is accessed. This continues until all records are updated and move on to the next page.
  • The locks here are on Object, Page, and Key (the index). 
  • All Key and Page locks are released at the end of the transaction when the data is committed. 
  • There is a very small chance of lock.
+-------------------------+------------+----------+---------------+
| TextData                | Type       | Mode     | EventClass    |
+-------------------------+------------+----------+---------------+
| UPDATE dbo.UR_TBL SET…  |            |          |               |
| NULL                    | 5 - OBJECT | 8 - IX   | Lock:Acquired |
| 1:2456                  | 6 - PAGE   | 8 - IX   | Lock:Acquired |
| (8194443284a0)          | 7 - KEY    | 5 - X    | Lock:Acquired |
| (8194443284a0)          | 7 - KEY    | 0 - NULL | Lock:Released |
| 1:2456                  | 6 - PAGE   | 0 - NULL | Lock:Released |
+-------------------------+------------+----------+---------------+
| UPDATE dbo.UR_TBL SET…  |            |          |               |
| 1:2456                  | 6 - PAGE   | 8 - IX   | Lock:Acquired |
| (61a06abd401c)          | 7 - KEY    | 5 - X    | Lock:Acquired |
| (61a06abd401c)          | 7 - KEY    | 0 - NULL | Lock:Released |
| 1:2456                  | 6 - PAGE   | 0 - NULL | Lock:Released |
+-------------------------+------------+----------+---------------+
...
+-------------------------+------------+----------+---------------+
...
| n:nnnnnn:nn             | 7 - KEY    | 5 - X    | Lock:Released |
| n:nnnnnn:nn             | 7 - KEY    | 5 - X    | Lock:Released |
| n:nnnnnn                | 6 - PAGE   | 8 - IX   | Lock:Released |
|                         | 5 - OBJECT | 8 - IX   | Lock:Released |
+-------------------------+------------+----------+---------------+

Page Access and Masking operation

The example below shows a simplified view of how pages are accessed.

  • Step 0 - The masking job has started and the engine is starting to fetch 20,000 rows (default). MS SQL will only lock two pages (darker green) when reading (overlapping) and then released (light green).
  • Step 1 - The fetched records are starting to buffer (grey) in the Masking Engine (to optimise performance). The UPDATE is starting as soon as records are masked. This will be done in the order the rows were read (hence per page). 
  • Step 2 - The buffers are full and the fetch of records in the SELECT are paused. More records and pages are updated (and locked). 
  • Step 3 - When buffers come under a certain limit, more records are fetched. 
  • Step 4 - When the Commit Size is reached - the updated rows (pages) are committed. Another fetch has also happened here.
  • This process keeps going untill the end. 

KBA6319 - Clustered Index Pages.png

Clustered Index with WHERE

Since the Clustered Index reads the data per page it's easier to draw this example. 

Page Access and Masking operation

The example below shows a simplified view of how pages are accessed when there is a WHERE clause.

  • Step 0 - The masking job has started and the engine is starting to fetch 20,000 rows (default). Since it is not known which record matches the WHERE clause, all records are read but there are on 50% match so twice the number of pages needs to be read to reach 20,000 rows. MS SQL will only lock two pages (darker green) when reading (overlapping) and then released (light green).
  • Step 1 - The fetched records are starting to buffer (grey) in the Masking Engine (to optimise performance). The UPDATE is starting as soon as records are masked. This will be done in the order the rows were read (hence per page). 
    • There is an example here where a page did not have any matching (WHERE) records and hence - records in this page are not loaded into the Masking Engine. 
  • Step 2 - The buffers are full and the fetch of records in the SELECT are paused. The double amount of pages are read in order to fill the bigger. Also, more records and pages are updated (and locked). 
  • Step 3 - When buffers come under a certain limit, more records are fetched. 
  • Step 4 - When the Commit Size is reached - the updated rows (pages) are committed. Another fetch has also happened here. Note that on average the double amount of pages accessed. 
  • This process keeps going until the end. 

KBA6319 - Clustered Index w WHERE Pages.png

Nonclustered Index

Salient observations in the SELECT:

  • The fetched records are scattered from different pages (the fetch order is from the nonclustered index).
  • The same page is read many times. One page is even read again immediately following the first read.
  • The first page acquired is released much further down (not even in the list). 
+------------------+------------+--------+---------------+
| TextData         | Type       | Mode   | EventClass    |
+--------------------------------------------------------+
| SELECT "Mask", "DMS_ROW_ID" FROM "dbo"."YOUR_TABLE"    |
|                  | 5 - OBJECT | 6 - IS | Lock:Acquired |
| 1:338640         | 6 - PAGE   | 3 - S  | Lock:Acquired |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Acquired |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Released |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Acquired |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Released |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Acquired |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Released |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Acquired |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Released |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Acquired |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Released |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Acquired |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Released |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Acquired |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Released |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Acquired |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Released |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Acquired |
|     1:341089     | 6 - PAGE   | 3 - S  | Lock:Released |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Acquired |
|         1:340953 | 6 - PAGE   | 3 - S  | Lock:Released |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Acquired |
|       1:340565   | 6 - PAGE   | 3 - S  | Lock:Released |
...

Salient observations in the UPDATE:

  • The data updated in the same order as they are fetched.
  • This means the update is also scattered. 
  • The locks here are on Object, Page, RID (Row ID), and Key (the index). 
  • All RID and Page locks are released at the end of the transaction when the data is committed. 
  • There is an increased risk lock.
+-------------------------+------------+----------+---------------+
| TextData                | Type       | Mode     | EventClass    |
+-------------------------+------------+----------+---------------+
| UPDATE dbo.UR_TBL SET…  |            |          |               |
|                         | 5 - OBJECT | 8 - IX   | Lock:Acquired |
| 1:355750                | 6 - PAGE   | 7 - IU   | Lock:Acquired |
| (7fbb0d81d998)          | 7 - KEY    | 4 - U    | Lock:Acquired |
| 1:338956                | 6 - PAGE   | 7 - IU   | Lock:Acquired |
| 1:338956:89             | 9 - RID    | 4 - U    | Lock:Acquired |
| 1:338956                | 6 - PAGE   | 8 - IX   | Lock:Acquired |
| 1:338956:89             | 9 - RID    | 5 - X    | Lock:Acquired |
| 1:338956:89             | 9 - RID    | 0 - NULL | Lock:Released |
| 1:338956                | 6 - PAGE   | 0 - NULL | Lock:Released |
| (7fbb0d81d998)          | 7 - KEY    | 4 - U    | Lock:Released |
| 1:355750                | 6 - PAGE   | 7 - IU   | Lock:Released |
+-------------------------+------------+----------+---------------+
| UPDATE dbo.UR_TBL SET…  |            |          |               |
| 1:354672                | 6 - PAGE   | 7 - IU   | Lock:Acquired |
| (179176a0f183)          | 7 - KEY    | 4 - U    | Lock:Acquired |
| 1:338640                | 6 - PAGE   | 7 - IU   | Lock:Acquired |
| 1:338640:1              | 9 - RID    | 4 - U    | Lock:Acquired |
| 1:338640:1              | 9 - RID    | 5 - X    | Lock:Acquired |
| 1:338640:1              | 9 - RID.   | 0 - NULL | Lock:Released |
| 1:338640                | 6 - PAGE.  | 0 - NULL | Lock:Released |
| (179176a0f183)          | 7 - KEY    | 4 - U    | Lock:Released |
| 1:354672                | 6 - PAGE   | 7 - IU   | Lock:Released |
+-------------------------+------------+----------+---------------+
...
+-------------------------+------------+----------+---------------+
...
| n:nnnnnn:nn             | 9 - RID    | 5 - X    | Lock:Released |
| n:nnnnnn:nn             | 9 - RID    | 5 - X    | Lock:Released |
| n:nnnnnn                | 6 - PAGE   | 8 - IX   | Lock:Released |
|                         | 5 - OBJECT | 8 - IX   | Lock:Released |
+-------------------------+------------+----------+---------------+

Page Access and Masking operation

The example below shows a simplified view of how pages are accessed when the data is read out of order and there is a lock.

Note: A NonClustered index might read the data out of order. NonClustered Indexes can read data in different orders. This example is an illustration of out of order read. Some reads will acquire different types of locks. 

 

  • Step 0 - The masking job has started and the engine is starting to fetch 20,000 rows (default). Since the data is read out of order -multiple pages need to be accessed. Also, in this example, there is only one Page lock acquired at any time (darker green) and then released (light green). The same page can be accessed multiple times. 
  • Step 1 - The fetched records are starting to buffer (grey) in the Masking Engine. The UPDATE is starting as soon as records are masked. To show this, a separate row has been drawn.
    • The Update is done in the order they are read and hence the updated pages are also scattered.
  • Step 2 - The buffers are full and the fetch of records in the SELECT are paused. More records and pages are updated (and locked). 
  • Step 3 - When buffers come under a certain limit, more records are fetched. 
    • This is now when we are at risk to have locks. There are a number of Page locks due to UPDATE and they are scattered. The masking engine needs to fetch more records and these are also scattered. 
    • If a page needs to be accessed (x) to fetch data that already has a lock due to the UPDATE - the process locks.

KBA6319 - Out Of Order and Lock Pages.png

Nonclustered Index and WHERE

When there is a WHERE clause in the SELECT statement. The same number of pages is read since the page needs to be fetched to read the record and apply the WHERE clause (this might change if an index is used on the filtered column). 

The number of records fetched into the masking engine for masking is not less (due to the filter) but when updating each transaction will still contain the number of records defined in Commit Size, hence an even higher risk to have a lock (or deadlock). 

This can be seen in the Clustered Index example above. 

 

Good query 

SELECT
    OBJECT_NAME(ps.object_id) as TableName,
    i.name as IndexName,
    ps.index_type_desc,
    ps.page_count,
    ps.avg_fragmentation_in_percent,
    ps.forwarded_record_count
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
INNER JOIN sys.indexes AS i
    ON ps.OBJECT_ID = i.OBJECT_ID  
    AND ps.index_id = i.index_id
WHERE OBJECT_NAME(ps.object_id) = 'Demo'