Delphix

Troubleshooting File Masking Performance Issues (KBA6796)

 


KBA# 6796

At a Glance 

Summary: This article describes performance troubleshooting steps for File Masking. It covers both On-The-Fly (OTF) and In-Place (IP) masking and the different connection types: FTP, SFTP, and File Mounts.
DIY Investigation: These are troubleshooting steps that you can perform yourself:
  1. Measure latency using ping.
  2. Run an OTF masking job with no masked fields and note the performance.
Support Investigation: If Delphix Support should investigate the performance, these are the steps:
  1. Detail results from ping and any File Masking investigations.  
  2. Set Job Configuration to: 
    • Feedback Size to 10,000 
    • Row Limit to 50,000 (6.0.4 onwards).
    • One file (Streams = 1 or select a single file).
  3. Run the masking job (cancel after 5 minutes).
  4. Upload a Support Bundle. 
  5. Detail the Masking Job ID and Execution ID.
Bottlenecks and actions: These are possible bottlenecks and next actions: 

File Input:
  • File reading throughput performance.
  • Latency and throughput might need to be improved. 

Masking:

  • Check the number of masked fields.
  • Check the algorithms used (are they optimal for performance?).
  • Check the CPU utilization. 

Sort (up to 6.0.3 only):

  • This step has been removed from 6.0.4 onwards.
  • The step is memory related.
  • Might need to split the file into smaller files to optimize performance.

File Output:

  • File writing throughput performance.
  • Latency and throughput might need to be improved. 

Applicable Delphix Versions

Major Release: All Sub Releases
6.0: 6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0
5.3: 5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0, 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0
5.2: 5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1
5.1: 5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0
5.0: 5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1, 5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4
4.3: 4.3.1.0, 4.3.2.0, 4.3.2.1, 4.3.3.0, 4.3.4.0, 4.3.4.1, 4.3.5.0
4.2: 4.2.0.0, 4.2.0.3, 4.2.1.0, 4.2.1.1, 4.2.2.0, 4.2.2.1, 4.2.3.0, 4.2.4.0, 4.2.5.0, 4.2.5.1
4.1: 4.1.0.0, 4.1.2.0, 4.1.3.0, 4.1.3.1, 4.1.3.2, 4.1.4.0, 4.1.5.0, 4.1.6.0

 

 

File Masking Performance 

This Knowledge Article describes how to troubleshoot File Masking performance. On-The-Fly (OTF) and In-Place (IP) File Masking are identical except for the final Rename step in IP masking, so this document covers both.

Connection methods covered are FTP, SFTP, and File Mounts. 

The steps in this guide can be performed with commonly available tools and help answer the following questions:

  • Is the bottleneck reading the data (File In)?
  • Is the bottleneck the masking operation (algorithms and number of fields)?
  • Is the bottleneck the output (File Out)?

File Masking Best Practice - OTF and IP  

The On-The-Fly (OTF) and In-Place (IP) File Masking processes differ only in the last stage, where IP has a Rename (Overwrite) step. All other stages are the same.

From a performance perspective, the Best Practice for File Masking is to use OTF Masking.

For FTP and SFTP, how the rename in IP Masking is executed depends on the implementation of the (S)FTP software. The file may need to be transferred over the network again in order to be renamed, which adds significantly to the execution duration.

Performance factors

The performance factors are: 

  1. The Masked File:
    • File Format - the number of masked fields and type of algorithms.
    • File Size - the number of bytes per row. 
  2. Network - latency and throughput.
  3. Masking Engine - the CPU performance (GHz and number of CPUs).
Notes:

  • It is the File Format that defines the algorithms and the fields to be masked. The file can have fewer fields but the File Format will define how many columns to mask. The other big factor is the number of bytes per row in the file.
  • In general, File Masking tends to have a larger number of bytes per row and tends to have more masked fields than database masking.
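Since the number of bytes per row is one of the main factors, it can help to measure it before looking at the job itself. The following is a minimal sketch, assuming a line-delimited file that is readable from the host where the script runs; the function name `average_bytes_per_row` and the sample size are illustrative choices, not part of the product.

```python
from pathlib import Path


def average_bytes_per_row(path, sample_rows=10_000):
    """Return the mean row length in bytes over the first `sample_rows` lines.

    Assumes one record per line; fixed-width files without line
    terminators would need a different approach.
    """
    total_bytes = 0
    rows = 0
    with Path(path).open("rb") as handle:
        for line in handle:
            total_bytes += len(line)  # includes the line terminator
            rows += 1
            if rows >= sample_rows:
                break
    return total_bytes / rows if rows else 0.0
```

A large value here suggests that throughput (File In and File Out) is more likely to matter than for a typical database masking job.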

 

Masking Performance at different stages

As with all masking jobs, the performance will differ at different stages. These stages can be identified and measured at the start of the masking job. Measuring these stages can help determine where the bottleneck is located.

Tip:

The performance bottlenecks can also be measured when the masking job finishes, but this is usually impractical because the job has to finish successfully before the metrics can be uploaded in the support bundle. Furthermore, the maximum performance of each stage cannot be determined that way, as it is only detectable at the start of the job and with the detailed configuration.

 

A normal On-The-Fly masking job execution has the following characteristics:

  1. Initial File Input
    • The first 10,000 to 20,000 records are normally read fast as rows are being read into buffers on the masking engine.
    • Hence, this stage is usually visible at the start of the job and will show the Input performance before masking rows.
  2. Masking Operation  
    • The result of the masking operation can also be seen in the graph below.
    • The masking performance usually drops when records are masked. How much depends on the masking performance factors (see above).
    • File Masking performance is usually capped by the masking operation.
  3. File Output
    • The File Output step is the last step that can dictate the performance.
    • If this step is the slowest, all buffers will be full and the job will be capped by the performance of this step.

File Input and Output Performance Graph 

By temporarily setting the Feedback Size to 10,000 and the Row Limit to 50,000, the processing characteristics can be illustrated in a graph. The graph shows the performance of each stage as the buffers fill up, which indicates the bottleneck. Performance is shown relative to the rpm (rows per minute) of the overall masking job.

The graph and how to read it

  • The relative performance is in relation to the performance masking the file in total (the rpm value in the UI). 
  • The key is to find where this value is around 1 as this indicates the bottleneck. 
  • The goal is to improve this stage of the masking operation. 
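The reading rule above can be expressed as simple arithmetic: divide each stage's rate by the overall job rpm and look for the value closest to 1. The sketch below assumes the per-stage rates have already been obtained by hand; the stage names, the numbers, and both function names are hypothetical and only illustrate the calculation.

```python
def relative_performance(stage_rpm, overall_rpm):
    """Divide each stage's rows per minute by the overall job rpm."""
    if overall_rpm <= 0:
        raise ValueError("overall_rpm must be positive")
    return {stage: rpm / overall_rpm for stage, rpm in stage_rpm.items()}


def likely_bottleneck(stage_rpm, overall_rpm):
    """Return the stage whose relative performance is closest to 1."""
    ratios = relative_performance(stage_rpm, overall_rpm)
    return min(ratios, key=lambda stage: abs(ratios[stage] - 1.0))


# Hypothetical measurements in rows per minute, for illustration only.
stages = {"File In": 900_000, "Masking": 450_000, "File Out": 155_000}
ratios = relative_performance(stages, overall_rpm=150_000)
bottleneck = likely_bottleneck(stages, overall_rpm=150_000)
```

With these made-up numbers, File Out has a relative performance of about 1 and would be the stage to improve, while File In and Masking have headroom.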

Example

In the example, File Out is capping the performance and the characteristics are:

  • The File In performance is initially the max performance that can be fetched from the file source. 
  • The File In will be at max speed until the initial buffers are full (around 20,000 - 30,000 rows).
  • As soon as the records start to appear in the masking engine they will be sent to the File Out step (6.0.4 and up).
  • The File Out performance can therefore be seen from the beginning of the job (6.0.4 and up).
  • When input buffers are full, the Masking performance is seen. 
  • When all (relevant) buffers are full, then all steps are forced to the same performance as the bottleneck (in this case, File Out).
  • To improve the job below, File Out needs to be improved. 
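The buffer behaviour described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the masking engine's actual implementation: it assumes three stages connected by two bounded buffers, fixed per-tick rates, and a single hypothetical buffer size (`buffer_rows`). All names and numbers are invented for illustration.

```python
def simulate_pipeline(in_rate, mask_rate, out_rate, buffer_rows, ticks):
    """Return per-tick (file_in, masking, file_out) row counts.

    Two bounded buffers sit between the three stages. Each tick, a stage
    moves as many rows as its own rate, its input, and the free space
    downstream allow.
    """
    input_buffer = 0   # rows read but not yet masked
    output_buffer = 0  # rows masked but not yet written
    history = []
    for _ in range(ticks):
        # Drain downstream first so that freed space is usable in this tick.
        written = min(out_rate, output_buffer)
        output_buffer -= written

        masked = min(mask_rate, input_buffer, buffer_rows - output_buffer)
        input_buffer -= masked
        output_buffer += masked

        read = min(in_rate, buffer_rows - input_buffer)
        input_buffer += read

        history.append((read, masked, written))
    return history


# Hypothetical rates (rows per tick) with File Out as the slowest stage.
history = simulate_pipeline(
    in_rate=1000, mask_rate=600, out_rate=300, buffer_rows=2000, ticks=12
)
```

In this model, File In starts at its maximum rate, Masking runs at its own rate until the output buffer fills, and after a few ticks all three stages settle at the File Out rate, which mirrors the characteristics listed above.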

 

[Figure: KBA6796 - Rel Perf v3 - File Out Capping w Descr.png]

Examples of Bottlenecks

No two masking jobs are exactly the same and the characteristics will differ. With that said, there are some key bottleneck cases based on where the main bottleneck is located.

For versions before 6.0.4 (with SORT), File Out is shown at the end of the graph to indicate that it happens in a separate stage. In the graph, the lines are dotted to indicate that the stage is ongoing until all rows have been masked. At that point, the Sort finishes and File Out starts.

Capped File In

In this case, the job cannot run faster than the Input step, so the performance is capped by File In.

Note:

Versions before 6.0.4 have a SORT step that splits the job into two parts. File In and File Out usually perform the same, but to illustrate capping on File In, the Input step is shown as the slower one (hence File Out is fast in the pre-6.0.4 graph).

Actions:

  • Investigate File Input Throughput and Network Latency.
  • Investigate File Size - the number of bytes per row in the file. 
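For File Mounts, one way to investigate File Input throughput is to time a sequential read of the file from a host that has the same mount. The following is a minimal sketch under that assumption; the function name and chunk size are illustrative, and for FTP or SFTP the transfer rate reported by the client would be the equivalent measurement.

```python
import time


def read_throughput_mb_per_s(path, chunk_size=1024 * 1024):
    """Read the whole file sequentially and return (bytes_read, MB per second)."""
    bytes_read = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            bytes_read += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    mb_per_s = (bytes_read / 1_000_000) / elapsed if elapsed > 0 else float("inf")
    return bytes_read, mb_per_s
```

Repeated runs against the same file can be inflated by the operating system's file cache, so the first run on a file that has not been read recently is usually the most representative.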
Pre 6.0.4 (with SORT): [Figure: KBA6796 - Rel Perf v3 w Sort - File In Capping.png]
6.0.4 and beyond: [Figure: KBA6796 - Rel Perf v3 - File In Capping.png]
Capped Masking  

In this case, the Input will reach max performance and when the initial buffers are full the performance will be capped by the Masking operation.

Actions:

  • Check the number of masked fields.
  • Check the algorithms used (are they optimal for performance).
  • Check the capacity of the masking engine.
Pre 6.0.4 (with SORT): [Figure: KBA6796 - Rel Perf v3 w Sort - Mask Capping.png]
6.0.4 and beyond: [Figure: KBA6796 - Rel Perf v3 - Mask Capping.png]
Capped To Sort

This is only applicable to versions before 6.0.4. In these versions, the masked data is sent to the Sort step.

Notes:

  • The bottleneck in this step might only be seen later in the job execution (due to latent memory issues).
  • In those cases, the relative performance will be higher initially.

 

Actions:

  • Pre 6.0.4:
    • The SORT step is the bottleneck, likely due to memory issues.
    • The file might need to be split into smaller files to optimize performance.
Pre 6.0.4 (with SORT): [Figure: KBA6796 - Rel Perf v3 w Sort - Sort Capping.png]
6.0.4 and beyond: Not Applicable

Capped File Out

In this case, the performance is capped by the File Out step.

Notes:

  • The two implementations are different here.
  • Pre 6.0.4 - the File Out is from the Sort step.
  • On 6.0.4 and beyond - the File Out is a stream from the masking operations.

 

Actions:

  • Investigate File Out Throughput and Network Latency.
  • Investigate File Size - especially the size of the masked data. If the number of bytes per row is large, consider changing the algorithms to reduce the size.
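Network latency to the file server can be measured with ping, and the result is easier to report if the summary is parsed into numbers. The sketch below assumes a Linux or macOS host, where `ping -c` sets the packet count and the summary line has the form `min/avg/max/... = a/b/c/d ms`; the function names are illustrative, and on Windows both the flag and the output format differ.

```python
import re
import subprocess

# Matches the summary line printed by common Linux and macOS ping builds, e.g.
# "rtt min/avg/max/mdev = 0.321/0.398/0.512/0.071 ms".
_SUMMARY = re.compile(r"=\s*([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)\s*ms")


def parse_ping_summary(output):
    """Return (min, avg, max) round-trip times in ms, or None if not found."""
    match = _SUMMARY.search(output)
    if match is None:
        return None
    low, avg, high, _spread = (float(value) for value in match.groups())
    return low, avg, high


def ping_latency(host, count=10):
    """Run the system ping and return its parsed summary (Linux/macOS flags)."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True, text=True, check=False,
    )
    return parse_ping_summary(result.stdout)
```

Calling `ping_latency` with the file server's host name from a machine on the same network segment as the masking engine gives figures that can be included when detailing the results for Support. The same measurement applies to the File Input side.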
Pre 6.0.4 (with SORT): [Figure: KBA6796 - Rel Perf v3 w Sort - File Out Capping.png]
6.0.4 and beyond: [Figure: KBA6796 - Rel Perf v3 - File Out Capping.png]

 

Collect data for Support Investigation 

If needed, the performance can be investigated by support. 

  1. Detail:
    • The file masking performance from the UI and the expected performance.
    • Measured Network Latency from ping.
    • Measured File Transfer Throughput.
    • Whether the file transfer is from and to the same server or different servers.
  2. Temporarily set Job Configuration to:
    • Feedback Size to 10,000.
    • Row Limit to 50,000.
    • Streams = 1 or select one file.

    Important: Remember to reset the Feedback Size to the original size afterwards (10,000 is too small for large jobs and will cause the logs to grow).

  3. Run a masking job.
  4. Upload a Support Bundle.
  5. Detail the Masking Job ID, Execution ID, and the file to investigate.

Details to collect from the UI:

  • The Job ID, Execution ID, and the file name. 
  • The number of masked rows.
  • The rpm for the file.