Troubleshooting File Masking Performance Issues (KBA6796)
KBA# 6796
At a Glance
Summary: | Below are performance troubleshooting steps for File Masking. They cover both On-The-Fly (OTF) and In-Place (IP) masking and the different connection types: FTP, SFTP, and File Mounts. |
---|---|
DIY Investigation: | These are troubleshooting steps that you can do:
|
Support Investigation: | If Delphix Support should investigate the performance, these are the steps:
|
Bottlenecks and actions: | These are possible bottlenecks and next actions: File Input, Masking, Sort (up to 6.0.3 only), and File Output. |
Applicable Delphix Versions
Major Release | All Sub Releases |
---|---|
6.0 | 6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0 |
5.3 | 5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0, 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0 |
5.2 | 5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1 |
5.1 | 5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0 |
5.0 | 5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1, 5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4 |
4.3 | 4.3.1.0, 4.3.2.0, 4.3.2.1, 4.3.3.0, 4.3.4.0, 4.3.4.1, 4.3.5.0 |
4.2 | 4.2.0.0, 4.2.0.3, 4.2.1.0, 4.2.1.1, 4.2.2.0, 4.2.2.1, 4.2.3.0, 4.2.4.0, 4.2.5.0, 4.2.5.1 |
4.1 | 4.1.0.0, 4.1.2.0, 4.1.3.0, 4.1.3.1, 4.1.3.2, 4.1.4.0, 4.1.5.0, 4.1.6.0 |
File Masking Performance
This Knowledge Article describes how to troubleshoot File Masking performance. On-The-Fly (OTF) and In-Place (IP) File Masking are identical except for the final Rename step in IP masking, so this document covers both.
Connection methods covered are FTP, SFTP, and File Mounts.
The steps in this guide can be performed with available tools and will help to determine and answer the following questions:
- Is the bottleneck reading the data (File In)?
- Is the bottleneck the masking operation (algorithms and number of fields)?
- Is the bottleneck the output (File Out)?
File Masking Best Practice - OTF and IP
The File Masking process between On-The-Fly (OTF) and In-Place (IP) is only different in the last stage where IP has a Rename (Overwrite) step. All other stages are the same.
From a performance perspective, the Best Practice for File Masking is to use OTF Masking.
For FTP and SFTP, how the rename in IP Masking is executed depends on the implementation of the (S)FTP software. In some implementations, the file has to be transferred over the network again in order to be renamed, which adds significantly to the execution duration.
Performance factors
The performance factors are:
- The Masked File:
- File Format - the number of masked fields and type of algorithms.
- File Size - the number of bytes per row.
- Network - latency and throughput.
- Masking Engine - the CPU performance (GHz and number of CPUs).
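The interaction between file size (bytes per row) and network throughput can be sketched with simple arithmetic. The function name and the numbers below are hypothetical and only illustrate the relationship; they are not measurements or limits from a Delphix engine.

```python
def max_rows_per_minute(throughput_bytes_per_sec, bytes_per_row):
    """Upper bound on rows per minute that a transfer link can carry.

    Both arguments are assumptions supplied by the reader, for example a
    measured transfer rate and the average row width of the file.
    """
    if bytes_per_row <= 0:
        raise ValueError("bytes_per_row must be positive")
    return throughput_bytes_per_sec * 60 / bytes_per_row


# Hypothetical link speed of 10,000,000 bytes per second and two row widths.
for width in (200, 2000):
    print(width, "bytes per row ->", round(max_rows_per_minute(10_000_000, width)), "rows per minute")
```

With the same link, a row that is ten times wider gives a ceiling that is ten times lower, which is why bytes per row is listed as a performance factor alongside throughput.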
Masking Performance at different stages
As with all masking jobs, the performance will differ at different stages. These stages can be identified and measured at the start of the masking job. Measuring these stages can help determine where the bottleneck is located.
A normal On-The-Fly masking job execution has the following characteristics:
- Initial File Input
- The first 10,000 to 20,000 records are normally read fast as rows are being read into buffers on the masking engine.
- Hence, this stage is usually visible at the start of the job and will show the Input performance before masking rows.
- Masking Operation
- The result of the masking operation can also be seen in the graph below.
- The masking performance usually drops when records are masked. How much depends on the masking performance factors (see above).
- File Masking performance is usually capped by the masking operation.
- File Output
- The File Output step is the last step that can dictate the performance.
- If this step is the slowest, all buffers will fill up and the job will be capped by the performance of this step.
File Input and Output Performance Graph
By temporarily setting the Feedback Size to 10,000 and the Row Limit to 50,000, the processing characteristics can be illustrated in a graph. This graph shows the performance as the buffers fill up and this will indicate the bottleneck. The graph shows the relative performance compared with the rpm of the overall masking job.
The graph and how to read it
- The relative performance is in relation to the performance masking the file in total (the rpm value in the UI).
- The key is to find where this value is around 1 as this indicates the bottleneck.
- The goal is to improve this stage of the masking operation.
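As a minimal sketch of how to read the graph, the relative performance of each stage can be computed by dividing the stage's observed rate by the overall rpm of the job, and the stage closest to 1 is the likely bottleneck. The function names and all rates below are hypothetical; real values would be read from the job's feedback output.

```python
def relative_performance(stage_rpm, overall_rpm):
    """Divide each stage's observed rate by the overall rpm of the job."""
    if overall_rpm <= 0:
        raise ValueError("overall_rpm must be positive")
    return {stage: rpm / overall_rpm for stage, rpm in stage_rpm.items()}


def likely_bottleneck(relative):
    """Return the stage whose relative performance is closest to 1."""
    return min(relative, key=lambda stage: abs(relative[stage] - 1.0))


# Hypothetical readings, not taken from a real job.
observed = {"File In": 120_000, "Masking": 60_000, "File Out": 30_500}
relative = relative_performance(observed, overall_rpm=30_000)
print(relative)
print("Likely bottleneck:", likely_bottleneck(relative))
```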
Example
In the example File Out is capping performance and the characteristics are:
- The File In performance is initially the max performance that can be fetched from the file source.
- The File In will be at max speed until the initial buffers are full (around 20,000 - 30,000 rows).
- As soon as the records start to appear in the masking engine they will be sent to the File Out step (6.0.4 and up).
- The File Out performance can therefore be seen from the beginning of the job (6.0.4 and up).
- When input buffers are full, the Masking performance is seen.
- When all (relevant) buffers are full, then all steps are forced to the same performance as the bottleneck (in this case, File Out).
- To improve the job in this example, File Out needs to be improved.
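The behavior described in this example can be illustrated with a toy model of three stages joined by two bounded buffers. This is only a sketch under stated assumptions: the rates, the buffer size, and the one-tick scheduling are hypothetical and do not represent the masking engine's actual implementation.

```python
def simulate(rate_in, rate_mask, rate_out, buffer_rows, ticks):
    """Simulate File In -> Masking -> File Out with two bounded buffers.

    Rates are rows per tick. Returns a list of per-tick tuples
    (rows_read, rows_masked, rows_written).
    """
    in_buf = 0   # rows read but not yet masked
    out_buf = 0  # rows masked but not yet written
    history = []
    for _ in range(ticks):
        # File Out drains the output buffer first, freeing space for masking.
        written = min(rate_out, out_buf)
        out_buf -= written
        # Masking moves rows from the input buffer to the output buffer.
        masked = min(rate_mask, in_buf, buffer_rows - out_buf)
        in_buf -= masked
        out_buf += masked
        # File In reads at full speed until the input buffer is full.
        read = min(rate_in, buffer_rows - in_buf)
        in_buf += read
        history.append((read, masked, written))
    return history


# Hypothetical rates where File Out is the slowest stage.
run = simulate(rate_in=5000, rate_mask=2000, rate_out=1000, buffer_rows=20000, ticks=60)
for tick in (0, 6, 59):
    print("tick", tick + 1, run[tick])
```

In this model File In starts at its own maximum rate, drops to the masking rate once the input buffer is full, and all three stages settle at the File Out rate once both buffers are full.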
Example of Bottlenecks
No two masking jobs are exactly the same, and their characteristics will differ. That said, there are some key bottleneck cases, classified by where the main bottleneck is located.
For versions before 6.0.4 (with SORT), the graphs show File Out at the end to indicate that File Out happens in a separate stage. The lines are dotted to indicate that the preceding stages continue until all rows have been masked. At that point, the Sort finishes and File Out starts.
Capped File In
In this case, the performance will not be faster than the Input step and hence capped on File In.
Actions:
- Investigate File Input Throughput and Network Latency.
- Investigate File Size - the number of bytes per row in the file.
[Graphs: Capped File In, for Pre 6.0.4 (with SORT) and for 6.0.4 and beyond]
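For File Mounts, one way to investigate File Input throughput is to time a sequential read of the file from a host that has the same share mounted. The sketch below assumes such a host is available; the function name and the example path are hypothetical. Operating system caching can inflate the result on repeated runs.

```python
import time


def measure_read_throughput(path, chunk_size=1024 * 1024):
    """Read a file sequentially and return (bytes_read, seconds, bytes_per_second)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    rate = total / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, rate


# Example call with a hypothetical mount path:
# total, seconds, rate = measure_read_throughput("/mnt/masking_source/customers.csv")
# print(f"{total} bytes in {seconds:.2f} s = {rate / 1_000_000:.1f} MB/s")
```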
Capped Masking
In this case, the Input will reach max performance and when the initial buffers are full the performance will be capped by the Masking operation.
Actions:
- Check the number of masked fields.
- Check the algorithms used (are they optimal for performance).
- Check the capacity of the masking engine.
[Graphs: Capped Masking, for Pre 6.0.4 (with SORT) and for 6.0.4 and beyond]
Capped To Sort
This is only applicable to versions before 6.0.4. In these versions, the masked data is sent to the Sort step.
Actions:
- Pre 6.0.4:
- The SORT step is the bottleneck, likely due to memory issues.
[Graph: Capped To Sort, for Pre 6.0.4 (with SORT). Not applicable to 6.0.4 and beyond.]
Capped File Out
In this case, the performance is capped by the File Out.
Actions:
- Investigate File Out Throughput and Network Latency.
- Investigate File Size - especially the size of the masked data. If the number of bytes per row is large, consider changing the algorithms to reduce the size.
[Graphs: Capped File Out, for Pre 6.0.4 (with SORT) and for 6.0.4 and beyond]
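To check whether masking increases the number of bytes per row, the average row size of the source file and of the masked file can be compared. This is a minimal sketch that assumes one record per line; the function name and the file paths in the usage comment are hypothetical.

```python
def average_bytes_per_row(path):
    """Return the mean row size in bytes, including the line terminator."""
    rows = 0
    total = 0
    with open(path, "rb") as handle:
        for line in handle:
            rows += 1
            total += len(line)
    return total / rows if rows else 0.0


# Example comparison with hypothetical paths:
# before = average_bytes_per_row("/mnt/source/customers.csv")
# after = average_bytes_per_row("/mnt/target/customers.csv")
# print(f"source: {before:.1f} bytes/row, masked: {after:.1f} bytes/row")
```

A masked file with a noticeably larger average than its source suggests that the chosen algorithms produce longer values, which adds to the File Out load.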
Collect data for Support Investigation
If needed, the performance can be investigated by support.
- Detail:
- The file masking performance from the UI and the expected performance.
- Measured Network Latency from ping.
- Measured File Transfer Throughput.
- Whether the file transfer is from and to the same server or different servers.
- Temporarily set Job Configuration to:
- Feedback Size to 10,000.
- Row Limit to 50,000.
- Streams = 1 or select one file.
- Run a masking job.
- Upload a Support Bundle.
- Detail the Masking Job ID, Execution ID, and the file to investigate.
Details to collect from the UI:
- The Job ID, Execution ID, and the file name.
- The number of masked rows.
- The rpm for the table.
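When recording the network latency measured with ping, the summary line can be extracted so that the same figures are reported each time. The sketch below assumes the summary format printed by common Linux and macOS ping implementations; the function name and the sample line are hypothetical.

```python
import re

# Matches summary lines such as "rtt min/avg/max/mdev = ... ms" (Linux)
# or "round-trip min/avg/max/stddev = ... ms" (macOS and BSD).
_SUMMARY = re.compile(
    r"(?:rtt|round-trip) min/avg/max/(?:mdev|stddev) = "
    r"([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_summary(output):
    """Return latency figures in milliseconds, or None if no summary is found."""
    match = _SUMMARY.search(output)
    if match is None:
        return None
    minimum, average, maximum, deviation = (float(v) for v in match.groups())
    return {"min_ms": minimum, "avg_ms": average, "max_ms": maximum, "dev_ms": deviation}


# Hypothetical sample line, not a real measurement.
sample = "rtt min/avg/max/mdev = 0.412/0.538/0.701/0.094 ms"
print(parse_ping_summary(sample))
```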