Skip to main content
Delphix

Troubleshooting Masking Performance In-Place Generic DB (KBA6409)

 

KBA

KBA# 6409

 

At a Glance Edit section 

Summary: Masking Performance troubleshooting steps for a generic database. 

Special steps are available for Oracle and MS SQL Server

DIY Investigation: Steps: 
  1. Measure latency using ping.
  2. Measure duration for:
    • SELECT COUNT
    • SELECT
    • UPDATE
  3. If possible, Execution Plans for the above statements. 
Support Investigation: Steps:
  1. Upload the details collected above.
  2. Set Feedback Size to 10,000 and run one job with Streams = 1.
  3. Upload a Support Bundle. 
  4. Detail the Masking Job ID and Execution ID.

Applicable Delphix Versions

Click here to view the versions of the Delphix engine to which this article applies
Major Release All Sub Releases
6.0 6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.3.0, 6.0.3.1

5.3

5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0

5.2

5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1

5.1

5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0

5.0

5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1 ,5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4

4.3

4.3.1.0, 4.3.2.0, 4.3.2.1, 4.3.3.0, 4.3.4.0, 4.3.4.1, 4.3.5.0

4.2

4.2.0.0, 4.2.0.3, 4.2.1.0, 4.2.1.1, 4.2.2.0, 4.2.2.1, 4.2.3.0, 4.2.4.0 , 4.2.5.0, 4.2.5.1

4.1

4.1.0.0, 4.1.2.0, 4.1.3.0, 4.1.3.1, 4.1.3.2, 4.1.4.0, 4.1.5.0, 4.1.6.0

DB and Masking Performance  

This generic Knowledge Article will look at how to troubleshoot masking performance. With masking performance, we will only look at the rpm (rows per minute) of the actual masking operation (In-Place only). It will not include the duration it takes to work on Constraints and Indexes or Profiling performance. 

The steps in this guide can be performed with available tools and will help to determine and answer the following questions:

  • Is the bottleneck reading the data (SELECT)?
  • Is the bottleneck the masking operation (algorithms and number of columns)?
  • Is the bottleneck the output (UPDATE)?

Performance factors

The groups and the key factors to performance are: 

  1. Masking Engine - the number of masked columns, data types, and algorithms.
  2. Network - latency and throughput.
  3. Database - read and write performance, the number of records per block, and fragmentation.  

 

The diagram below shows where the delays (in milliseconds) are accumulating to become slow performance. 

KBA6409 - Generic - Timing Events v2.png

The diagram excludes the throughput and the size of the data as this is normally not a performance-limiting factor. The time tc (time to count) is only included for reference.

  • COUNT: tc = tlc + tdbc + tother 
  • MASKING: tm
  • SELECT: ts = tls + tdbs + tother
  • UPDATE: tu = tlu + tdbu + tother

tother = Java, JDBC. 

Masking performance in rpm will be dictated by compounding and capping factors since some of the processing are done in parallel. 

To improve masking performance and rpm, the bottleneck needs to be found and resolved. 

Masking Performance at different stages  

The performance will differ at different stages of the masking job. These stages can be measured at the start of the masking job. Measuring these stages can help determine where the bottleneck is located.

The performance can be measured when the masking job finishes as well but it is usually not practical to measure these as one has to wait until the job finishes successfully. 

 

A normal In-Place masking job execution has the following characteristics:

  1. Initial SELECT
    • The first 10,000 to 20,000 records are normally read fast as the records are read into the buffers of the masking job.
    • This is usually visible at the start of the job and will show the performance of the SELECT before masking any records.
  2. Masking Transformation  
    • There are also buffers to be filled at the end of the job. 
    • The masking performance usually drops when records are masked. How much depends on the masking factors (see above).
  3. Update
    • The last step that can dictate the performance is the UPDATE step.
    • If this step is the slowest (normal case) all buffers will be full and the job is capped by the UPDATE performance. 

 

Generic Moving v1 (fast input w gauge).gif

SELECT and UPDATE Performance Graph  

By temporarily setting the Feedback Size to 10,000, the execution can be illustrated in a graph. This graph shows the performance as the buffers fill up and this will indicate the bottleneck.

An illustration of this can be seen below, which shows the job above. The graph shows the relative performance compared with the rpm of the overall masking job.

 

The characteristics are: 

  • The SELECT performance is initially the max performance that can be fetched from the database. 
  • As soon as the records start to appear in the masking engine they will be sent to the UPDATE step.
  • Hence, the UPDATE performance can also be seen at the beginning of the job. 
  • The SELECT will be at max speed until the initial buffers are full (around 20,000 - 30,0000 rows).
  • The performance will then be (in this case) capped to the performance of the masking operation.
  • When all buffers (relevant) are full the SELECT step will be forced to the same performance as the UPDATE step.

Masking Relative Performance - SELECT Mask UPDATE.png

Example of Bottlenecks   

No masking job is exactly the same and the characteristics will differ. With that said, there are three key bottleneck cases based on where the main bottleneck is located. 

These cases are represented below (with example graphs): 

SELECT 

In this case, the performance will not be faster than the SELECT step and hence capped on SELECT.

Actions: investigate the execution plan for SELECT and network latency.

Masking operation  

In this case, the SELECT will reach max performance until the initial buffers are full and then be capped by the masking operation.

Actions: investigate the algorithms used and the capacity of the masking engine.

UPDATE  

This is the performance case detailed above. 

Actions: investigate the details around UPDATE and other operations on the database. 

 

Masking Relative Performance - SELECT Cap.png Masking Relative Performance - Masking Cap.png Masking Relative Performance - UPDATE Cap.png

Data Collection 

To find the bottleneck, we need to look at a few areas. Many of the reports will only show a limited amount of information. To see the complete picture, we need all data.

Some points to remember:  

  • Masking jobs:
    • Run one job with Stream = 1.
    • If multiple tables are executed, indicate which table should be focused on. 
  • Testing performance on the database:
    • Make sure to flush buffers and the cache before testing. 
  • Testing masking performance:
    • Set the algorithm to a fast algorithm (for example, TK) to see if that makes a difference. 

What to collect  

The following information needs to be collected: 

  • Masking Rule Set and Inventory.
    • For example: how many masked columns, algorithms, size of the data.
  • Database performance.
    • The number of rows in the table.
    • The durations from the database tests. 
    • Any execution plan.
  • Testing the masking performance.
    • Temporarily set Feedback Size to 10,000.
    • Test run the masking job.

Database performance tests  

Network performance   

Ping the Masking Engine and collect latency data. Run this a few time to get a good average value for the latency.

ping Your_Masking_Enging

SELECT COUNT Performance 

This will show how fast the database can count the number of records in the table. This is frequently done in the same way as SELECT below. The latency will not affect the duration of this query as only one value is returned. 

Actions

  • Change YOUR_TABLE to the table to test.
  • Take note of the number of records.
  • Take note of the duration to complete this statement (in seconds). 
SELECT COUNT(*) as COUNT_TEST from YOUR_TABLE

Database performance estimates 

With the metadata above, the select fetch performance for counting the records can be calculated (note that this is the performance on the database and excludes network latency and java processing times): 

  • Count Rows per Minute: RowCount / Duration_Secs * 60

SELECT Performance 

The SELECT is here encapsulated in a SELECT COUNT to reduce the output size. The duration is expected to be the same as above. 

Actions

  • Change YOUR_TABLE to the table to test.
  • Take note of the number of records.
  • Take note of the duration to complete this statement (in seconds). 
SELECT count(*) AS SELECT_TEST from (select * from YOUR_TABLE) as tst

Database performance estimates 

With the metadata above, the select fetch performance can be calculated (note that this is the performance on the database and excludes network latency and java processing times): 

  • Select Rows per Minute: RowCount / Duration_Secs * 60

UPDATE Performance 

The UPDATE performance is usually slower and a likely bottleneck. It is also harder to measure this as there is no generic way to perform this test.

The test needs to be performed on 100000 records. To do this a loop is needed which differs from database to database. 

Actions:

  • Code a loop of 100000 records.
  • Change YOUR_TABLE to the table to test.
  • Change Col1, Col2,... to the masked columns.
  • Take note of the number of records.
  • Take note of the duration to complete this statement (in seconds). 
Loop 100000 records
    UPDATE YOUR_TABLE set Col1 = Col1, Col2 = Col2, ...

Database performance estimates 

With the metadata above, the UPDATE performance can be calculated (note that this is the performance on the database and excludes network latency and java processing times): 

  • Update Rows per Minute: RowCount / Duration_Secs * 60

Table Metadata 

Some data about the table is also useful. 

Actions

  • Take note of the table description.
  • Take note of any triggers and indexes on the table.
  • If possible - collect the number of pages/blocks required to store this table on the database. 

Example Output

The example below is from a small table called 'MyTable'. The SQL to collect this will differ from database to database. 

+-----------+----------+------------+-----------+-------------+
| TableName | RowCount | TotalPages | UsedPages | RowsPerPage |
+-----------+----------+------------+-----------+-------------+
| MyTable   | 509223   | 1937       | 1933      | 263         |
+-----------+----------+------------+-----------+-------------+

Example calculations 

Crunching the numbers above we get the following result (here presented in a matrix). The details below are from a small VM. 

  • The UPDATE is 7.8 times slower than SELECT (650/83). This difference will be reflected in the masking job.
+---------------+-------------+----------+----------+---------+
|               |             |    COUNT |   SELECT |  UPDATE |
+---------------+-------------+----------+----------+---------+
|               |    Metadata |    47 ms |    47 ms |  360 ms |
+---------------+-------------+----------+----------+---------+
| Rows per Min  | 509223 rows | 650 Mrpm | 650 Mrpm | 83 Mrpm |
+---------------+-------------+----------+----------+---------+

Collect data for Support Investigation 

If needed, the performance can be investigated by support. 

  1. Upload the details collected above:
    • Table description.
    • Details about Indexes and Triggers.
    • SELECT and UPDATE statistics collected from the database. 
       
  2. Set Feedback Size to 10,000 and run one job with Streams = 1.
  3. Upload a Support Bundle. 
  4. Detail the Masking Job ID, Execution ID, and the table to investigate. 

 

Related Articles

The following articles may provide more information or related information to this article: