Delphix

Troubleshooting File Masking Performance (KBA6796)

 

 


KBA# 6796

At a Glance 

Summary: This article provides performance troubleshooting steps for File Masking. It covers both On-The-Fly (OTF) and In-Place (IP) masking and the different connection types, including FTP, SFTP, and File Mounts.
Bottlenecks and actions: The following are some possible bottlenecks and suggested actions: 

File Input and File Output:
  • Throughput performance.
  • Latency, throughput, and file size optimization are important.
    • High latency and low throughput will slow down file transfer.
    • Large files take longer to transfer.

Masking:

  • Check the algorithms used (are they optimal for performance?).
    • New versions of algorithms are faster (especially Segment Mapping).
  • Check the CPU utilization. 
DIY Investigation: Consider the following troubleshooting steps: 
  1. Check the file transfer performance (ping, latency, and throughput).
  2. Check file sizes (even if only one field is masked - the whole file needs to be read).
  3. Check the Algorithms.
  4. Check Job Configuration - ensure Row Limit is set (between 10,000 and 50,000).
  5. Try OTF masking with no masked fields - what is the performance?
Support Investigation: If Delphix Support is engaged to investigate the performance, provide the following:
  1. Detail results from ping (what is the latency).  
  2. To collect performance data, rerun the job with the Job Configuration set to: 
    • Feedback Size to 10,000 
    • Row Limit to 50,000
    • One File
  3. Run masking job (cancel after 5 min).
  4. Upload a Support Bundle. 
  5. Detail the Masking Job ID and Execution ID.
Older versions: If the version is 6.0.3 or earlier, it is highly recommended to upgrade to the latest version. The new versions have improved memory management and better performance. Some features might not be available on these versions. 

Applicable Delphix Versions

Major Release    All Sub Releases
6.0    6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0, 6.0.6.0, 6.0.6.1, 6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1, 6.0.13.0, 6.0.13.1, 6.0.14.0, 6.0.15.0, 6.0.16.0, 6.0.17.0, 6.0.17.1, 6.0.17.2
5.3    5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0, 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0
5.2    5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1
5.1    5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0
5.0    5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1, 5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4
4.3    4.3.1.0, 4.3.2.0, 4.3.2.1, 4.3.3.0, 4.3.4.0, 4.3.4.1, 4.3.5.0
4.2    4.2.0.0, 4.2.0.3, 4.2.1.0, 4.2.1.1, 4.2.2.0, 4.2.2.1, 4.2.3.0, 4.2.4.0, 4.2.5.0, 4.2.5.1
4.1    4.1.0.0, 4.1.2.0, 4.1.3.0, 4.1.3.1, 4.1.3.2, 4.1.4.0, 4.1.5.0, 4.1.6.0

 

 

File Masking Performance 

This Knowledge Article will look at how to troubleshoot File Masking Performance. This document covers all types of File Masking jobs:

  • Method: OTF and IP.
  • Type: Delimited, Fixed, and VSAM. 
  • Connection methods: FTP, SFTP, and File Mounts. 

The steps in this guide can be performed with available tools and will help to determine and answer the following questions:

  • Is the bottleneck reading the data (file input)?
  • Is the bottleneck the masking operation (algorithms and number of fields)?
  • Is the bottleneck the appliance - CPU or Memory?
  • Is the bottleneck writing the data (file output)?

File Masking Best Practice - OTF and IP  

The On-The-Fly (OTF) and In-Place (IP) masking processes differ only in the last stage, where IP has a Rename (Overwrite) step. All other stages are the same. 

From a performance perspective, the Best Practice for File Masking is to use OTF Masking.

Note:

Due to limitations in some (S)FTP software, the In-Place rename requires the file to be transferred over the network again in order to maintain file permission settings. This adds significantly to the running time of the masking job. 

Performance factors

The performance factors are: 

  1. The Masked File:
    • File Format - the number of masked fields and type of algorithms.
    • File Size - the number of bytes per row. 
  2. Network - latency and throughput.
  3. Masking Engine - the CPU performance (GHz and number of CPUs).
Notes:

  • It is the File Format that defines the algorithms and the fields to be masked. 
  • In general, File Masking tends to have a larger number of bytes per row and tends to have more masked fields than database masking - hence these jobs can be slower than database masking.
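
As a rough way to quantify the File Size factor, the sketch below estimates the average number of bytes per row and the approximate row count of a file. It assumes a newline-delimited file (for example, a Delimited file) that is readable from where the script runs; it does not apply as written to formats without line terminators. The function names and the 10,000-row sample size are illustrative and are not part of the Delphix product.

```python
import os


def average_bytes_per_row(path, sample_rows=10_000):
    """Average row size in bytes, measured over the first `sample_rows` lines."""
    total_bytes = 0
    rows = 0
    with open(path, "rb") as handle:  # binary mode: count bytes, not characters
        for line in handle:
            total_bytes += len(line)
            rows += 1
            if rows >= sample_rows:
                break
    return total_bytes / rows if rows else 0.0


def estimated_row_count(path, sample_rows=10_000):
    """Approximate total rows: file size divided by the sampled average row size."""
    average = average_bytes_per_row(path, sample_rows)
    return round(os.path.getsize(path) / average) if average else 0
```

A large bytes-per-row value means that each masked row costs more to read and write, regardless of how many fields are masked.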

 

Performance at different stages

As with all masking jobs, the process of masking a row/record goes through a set of stages. These stages can be identified and measured (usually easiest at the start of the masking job). Measuring these stages can help determine a bottleneck and where it is located.

Versions 6.0.3 and earlier have a different process. These jobs are described in a separate section below. 

Tip:

Performance bottlenecks can also be measured after the masking job finishes, but this is usually not practical because the job has to finish successfully before the metrics can be uploaded in the support bundle. Furthermore, the maximum performance of each stage cannot be determined this way, as it is only detectable at the start of the job and with the detailed configuration settings.

 

A normal On-The-Fly masking job execution has the following characteristics:

  1. Initial File Input
    • The first 10,000 to 20,000 records are normally read fast as rows are being read into buffers on the masking engine.
    • This stage is usually visible at the start of the job and will show the Input performance before masking rows.
  2. Masking Operation  
    • The masking performance usually drops when records are masked. How much depends on factors such as algorithms and CPU.
    • File Masking performance is usually capped by the masking operation.
  3. File Output
    • The File Output step is the last step that can dictate the performance.
    • If this step is the slowest, all buffers will fill up and the job will be capped by the performance of this step. 

File Input and Output Performance Graph 

By temporarily setting the Feedback Size to 10,000 and the Row Limit to 50,000, the processing characteristics can be seen in the masking logs.

Below, the logs are illustrated as a graph, which makes the performance characteristics easier to see. The graph shows the performance as the buffers fill up and indicates the bottleneck. It plots the performance of each stage relative to the rpm of the overall masking job.
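
To turn progress entries into a curve like the one in the graph, the rate between consecutive progress samples can be computed. The sketch below assumes the samples for one stage have already been extracted by hand as (elapsed seconds, cumulative rows) pairs, and it takes rpm to mean rows per minute. It does not parse the masking logs, whose format is not described in this article, and the function name and sample values are hypothetical.

```python
def interval_rpm(samples):
    """Rows per minute between consecutive (elapsed_seconds, cumulative_rows) samples."""
    rates = []
    for (t0, rows0), (t1, rows1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order samples
        rates.append((rows1 - rows0) / elapsed * 60)
    return rates


# Hypothetical samples, one per 10,000 rows (the temporary Feedback Size).
file_in = [(0, 0), (2, 10_000), (4, 20_000), (12, 30_000), (20, 40_000)]
print(interval_rpm(file_in))  # [300000.0, 300000.0, 75000.0, 75000.0]
```

In this made-up series, the rate drops after the first 20,000 rows, which is the kind of step the graph makes visible when the initial buffers fill up.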

How to read the data

  • Relative performance is measured against the overall performance of masking the whole file (the rpm value in the UI). 
  • The key is to find where this value is around 1 as this indicates the bottleneck. 
  • The goal is to improve this stage of the masking operation. 
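
The reading rule above can be sketched as a small calculation: divide each stage's peak rate by the overall job rate and look for the stage closest to 1. The stage names, rates, and function names below are hypothetical and only illustrate the arithmetic; they are not output from a real job.

```python
def relative_performance(stage_rpm, overall_rpm):
    """Each stage's rate divided by the overall job rate."""
    return {stage: rpm / overall_rpm for stage, rpm in stage_rpm.items()}


def likely_bottleneck(stage_rpm, overall_rpm):
    """Stage whose relative performance is closest to 1."""
    relative = relative_performance(stage_rpm, overall_rpm)
    return min(relative, key=lambda stage: abs(relative[stage] - 1.0))


# Hypothetical peak rates per stage, and the overall rpm as shown in the UI.
peaks = {"File In": 380_000, "Masking": 210_000, "File Out": 100_000}
print(relative_performance(peaks, overall_rpm=100_000))
print(likely_bottleneck(peaks, overall_rpm=100_000))  # File Out
```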

Example

In the example, the File Out is capping performance and the characteristics are: 

  • The 'File In' performance is initially the max performance that can be fetched from the file source. 
    • The 'File In' will be at max speed until the initial buffers are full (around 20,000 - 30,000 rows).
    • Here it is ~3.8x the average performance (so much faster). 
       
  • As soon as the records start to appear in the masking engine they will be sent to the File Out step. 
  • The 'File Out' performance can therefore be seen from the beginning of the job. 
     
  • When input buffers are full, the Masking performance is seen. 
    • It is here ~2.1x the average performance.
       
  • When all (relevant) buffers are full, then all steps are forced to the same performance as the bottleneck (in this case, File Out).
  • To improve the job below, 'File Out' needs to be improved. 

 

KBA6796_-_Rel_Perf_v3_-_File_Out_Capping_w_Descr.png

Bottlenecks and actions  

No two masking jobs are exactly the same and the characteristics will differ. With that said, there are some key bottleneck cases based on where the main bottleneck is located.

Capped on File Input

In this case, the job cannot run faster than the Input step, so performance is capped on File Input.

Actions:

  • Investigate File Input Throughput and Network Latency.
  • Investigate File Size - the number of bytes per row in the file. 
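
For File Mounts, one way to check File Input throughput is to time a plain sequential read of the file from a host that mounts the same share. The sketch below is an assumption-laden approximation: it measures the reading host, not the masking engine itself, and a second run may be inflated by operating system caching. The function name is hypothetical.

```python
import time


def read_throughput_mb_per_s(path, chunk_size=1024 * 1024):
    """Read the whole file in chunks and return the observed throughput in MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    return (total_bytes / 1_000_000) / elapsed if elapsed > 0 else float("inf")
```

If the measured value is low, the file source or network path is a more likely cause of the cap than the masking job configuration.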

KBA6796_-_Rel_Perf_v3_-_File_In_Capping.png

Capped Masking  

In this case, the Input will reach max performance and when the initial buffers are full the performance will be capped by the Masking operation.

 

Actions:

  • Check the number of masked fields.
  • Check the algorithms used (are they optimal for performance?).
  • Check the capacity of the masking engine (CPU usage).

KBA6796_-_Rel_Perf_v3_-_Mask_Capping.png

Capped File Output

In this case, the performance is good on Input and on Masking but is capped by the File Out step.

 

Actions:

  • Investigate File Out Throughput and Network Latency.
  • Investigate File Size - especially the size of the masked data. If the number of bytes per row is large, consider changing the algorithms to reduce the size. 
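
The effect of row size on File Out can be estimated with simple arithmetic: total bytes divided by throughput. The sketch below uses hypothetical numbers (row counts, row sizes, and a 20 MB/s link) purely to show how a larger masked row lengthens the transfer; it ignores latency and protocol overhead.

```python
def transfer_seconds(rows, bytes_per_row, throughput_mb_per_s):
    """Estimated seconds to move `rows` rows of `bytes_per_row` bytes at the given MB/s."""
    total_mb = rows * bytes_per_row / 1_000_000
    return total_mb / throughput_mb_per_s


# Hypothetical: 10 million rows over a 20 MB/s link, with 200-byte and 260-byte rows.
print(transfer_seconds(10_000_000, 200, 20))  # 100.0
print(transfer_seconds(10_000_000, 260, 20))  # 130.0
```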

KBA6796_-_Rel_Perf_v3_-_File_Out_Capping.png

6.0.3 and earlier

Versions 6.0.3 and earlier performed file masking in two distinct stages, (1) Input and Masking and (2) Output, with a Sort step between the two.

Due to this, the performance stages (and graph) look very different. The file needed to be fully masked before the commencement of the File Output stage. This also meant that memory requirements were much higher. 

For this reason, the File Out is shown at the end of the graph to indicate that the File Out happens in a separate stage. In the graph, the lines are dotted to indicate that this is ongoing until all rows have been masked. 
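
The difference between the two processes can be illustrated with a simplified timing model. This is an assumption made for illustration, not a description of the engine's internals: it treats the newer process as fully overlapped stages, treats the older process as two sequential stages, and ignores the Sort step and buffer warm-up. All rates are hypothetical.

```python
def streaming_minutes(rows, file_in_rpm, masking_rpm, file_out_rpm):
    """Overlapped stages: the slowest stage sets the pace for the whole job."""
    return rows / min(file_in_rpm, masking_rpm, file_out_rpm)


def two_stage_minutes(rows, file_in_rpm, masking_rpm, file_out_rpm):
    """Sequential stages: output starts only after all rows have been masked."""
    return rows / min(file_in_rpm, masking_rpm) + rows / file_out_rpm


# Hypothetical rates in rows per minute for a 1,000,000-row file.
print(streaming_minutes(1_000_000, 400_000, 200_000, 250_000))  # 5.0
print(two_stage_minutes(1_000_000, 400_000, 200_000, 250_000))  # 9.0
```

Under this model, a File Out stage that is not the slowest still adds its full duration to the job in the older process.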

Bottlenecks and actions

Capped on File Input 

Usually File In and File Out perform the same. In this case, to illustrate capping on File Input, the Input step is slower (hence File Out is fast in versions before 6.0.4).

  • Investigate File Input Throughput and Network Latency.
  • Investigate File Size - the number of bytes per row in the file (big files will take time to transfer). 
Capped on Masking

When masking is capping performance, the Input peaks and then the performance drops.

  • Check the number of masked fields.
  • Check the algorithms used (are they optimal for performance?).
  • Check the capacity of the masking engine (CPU usage).
Capped on Sort

This is only applicable to versions before 6.0.4. In these versions, the masked data is sent to the Sort step.

  • The Sort step is the bottleneck, likely due to memory issues.
Capped on File Output

The File Output (orange line) is slower than the rest:

  • Investigate File Out Throughput and Network Latency.
  • Investigate File Size - especially the size of the masked data.
    • If the number of bytes per row is large, consider changing the algorithms to reduce the size. 
Figure (Capped on File Input): KBA6796_-_Rel_Perf_v3_w_Sort_-_File_In_Capping.png
Figure (Capped on Masking): KBA6796_-_Rel_Perf_v3_w_Sort_-_Mask_Capping.png
Figure (Capped on Sort): KBA6796_-_Rel_Perf_v3_w_Sort_-_Sort_Capping.png
Figure (Capped on File Output): KBA6796_-_Rel_Perf_v3_w_Sort_-_File_Out_Capping.png

Collect data for Support Investigation 

If needed, the performance can be investigated by support. 

  1. Detail:
    • The file masking performance from the UI and the expected performance.
    • Measured Network Latency from ping.
    • Measured File Transfer Throughput.
    • Whether the file transfer (Input and Output) is from and to the same server or different servers.
       
  2. Temporarily set Job Configuration to:
    • Feedback Size to 10,000.
    • Row Limit to 50,000.
    • Streams = 1 or select one file.

    Important:

    Remember to reset the Feedback Size to the original size (10,000 is too small for large jobs and will cause the logs to grow).

  3. Run a masking job.
  4. Upload a Support Bundle.
  5. Detail the Masking Job ID, Execution ID, and the file to investigate. 
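
Where ping is blocked between hosts, the time to open a TCP connection to the file server can serve as a rough stand-in for the network latency figure, since the connection handshake takes about one round trip. The sketch below is an approximation under that assumption; the function name is hypothetical, and the host and port must be replaced with the real (S)FTP server details.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, attempts=5, timeout=5.0):
    """Median time in milliseconds to open a TCP connection to host:port."""
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=timeout):
            pass  # connection established; close it immediately
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)


# Example call (hypothetical server name, standard SFTP port):
# print(tcp_connect_latency_ms("sftp.example.com", 22))
```

The value includes connection setup on the server, so it should be reported as an approximation alongside, not instead of, a ping result where one is available.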

Details to collect from the UI:

  • The Job ID, Execution ID, and the file name. 
  • The number of masked rows.
  • The rpm for the table.