Delphix

Masking File - Row Limit and Feedback Size (KBA10636)

 

 

KBA

KBA#
10636

At a Glance  

Available in versions: 6.0.4.0 and later
Row Limit

Sets the maximum number of records held in the masking engine for an object at any given time. This limits the amount of memory used by the job. 
  • Default: 20,000
    • Due to internal throttling, keep this value above 2,000. 
Feedback Size

Sets how many rows are processed for each step before an entry is written to the logs. This parameter does not affect performance. 
  • Default: 50,000
    • Increase this on large files.
    • For large files, set this to 500,000; for very large files, set it to 5,000,000 (a good estimate is: max rows/500).

 

Too many log entries can cause the engine to run out of storage space. For large files, increase this value.

More Info For more information:

Reviewing the logs:


For databases:

Applicable Delphix Versions

Major Release              All Sub Releases
All, starting at 6.0.4.0   6.0.4.0 and newer

Overview

This article details two Masking Job parameters:

  • Row Limit - the maximum number of rows/records 'in flight' for each masked object. 
  • Feedback Size - sets the number of rows processed before writing a feedback entry into the logs.

Row Limit

The Row Limit is a new feature (available from 6.0.4.0 forward). This value sets the maximum number of records there can be in the masking engine for a specific object at any given moment. 

The best Row Limit value depends on:

 

  • Memory usage requirements (row data size in bytes).
  • Masking job type:
    • File / Mainframe / DB
    • In-Place / On-The-Fly

Technical Info:

There is a second upper limit on the engine: 10,000 rows per step in the masking job. This value is not configurable, hence the introduction of Row Limit. The Row Limit is implemented in the Input step, which checks with the Output step so that the number of rows in the engine never exceeds the Row Limit. 

The buffers also have a lower limit of 100 rows. If a buffer has not reached this limit, the step waits a few milliseconds (to see whether more rows arrive) before emptying the buffer. As a result, if the Row Limit is close to the sum of these lower limits, the steps will repeatedly pause for a few milliseconds and performance will not be optimal.
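
The relationship between the Row Limit and the buffer lower limit can be sketched as a rough sanity check. This is only an illustration: the 100-row lower limit comes from this article, while the function name, the `step_count` input, and the `margin` factor are hypothetical and are not engine settings.

```python
STEP_BUFFER_MIN_ROWS = 100  # lower limit per step buffer (from this article)


def row_limit_is_too_small(row_limit: int, step_count: int, margin: float = 2.0) -> bool:
    """Return True if row_limit is close to the summed buffer lower limits.

    step_count and margin are hypothetical inputs chosen for illustration;
    the engine does not expose them as settings.
    """
    if row_limit == 0:
        # 0 disables the Row Limit, so this check does not apply.
        return False
    summed_minimum = STEP_BUFFER_MIN_ROWS * step_count
    return row_limit < summed_minimum * margin
```

With these assumptions, a Row Limit of 500 on a job with five steps would be flagged, while the default of 20,000 would not.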

Default value

For most masking jobs the default value is optimal and does not need to be changed. The default value is blank (representing 20,000). 

Effects

The effects of changing the Row Limit:

  • Smaller Row Limit:
    • Reduces the amount of data in the engine per masked object. 
    • This reduces the chance of Out of Memory errors. 
    • Too small values will degrade performance (see Technical Info above).
       
  • Larger Row Limit:
    • This might increase the chance of Out of Memory errors.
       
  • Disabled 
    • Setting Row Limit = 0 will disable this feature. 
    • The upper limit is then defined by the 10,000 rows per step.
    • If the job is hanging, try disabling the Row Limit.
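
The three cases described in this article (blank, zero, and an explicit value) can be summarised in a small sketch. The function name is hypothetical, and treating negative input as invalid is an assumption made for this example rather than documented engine behaviour.

```python
from typing import Optional

DEFAULT_ROW_LIMIT = 20_000   # used when the field is left blank (from this article)
PER_STEP_MAX_ROWS = 10_000   # fixed per-step limit, not configurable (from this article)


def describe_row_limit(field_value: Optional[str]) -> str:
    """Explain how a Row Limit field value is interpreted, per this article."""
    if field_value is None or field_value.strip() == "":
        return f"Row Limit {DEFAULT_ROW_LIMIT} (default)"
    value = int(field_value)
    if value < 0:
        # Assumption: negative values are treated as invalid in this sketch.
        raise ValueError("Row Limit must be 0 or a positive integer")
    if value == 0:
        return f"Row Limit disabled; capped only by {PER_STEP_MAX_ROWS} rows per step"
    return f"Row Limit {value}"
```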

When to change?

This is a guide to when this value might need to be adjusted:

  • Masking Out Of Memory errors - reduce the Row Limit.
    • This is especially important on large file masking jobs, where each row has a large number of characters.
    • Large OTF jobs might also benefit from a reduced Row Limit.
    • It might be better to look at the size of each row - does all that data need to be transferred?
    • Start by reducing the value by a factor of 10. 
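
As a worked example of reducing the value in steps, the sketch below lists successive reductions by a factor of 10. The helper name is hypothetical, and using 2,000 as a floor (and treating 2,000 itself as acceptable) is an assumption based on the At a Glance note about internal throttling.

```python
ROW_LIMIT_FLOOR = 2_000  # this article advises against going lower (internal throttling)


def reduced_row_limits(start: int, factor: int = 10) -> list:
    """List successive reductions of start by factor, stopping at the floor."""
    values = []
    current = start // factor
    while current >= ROW_LIMIT_FLOOR:
        values.append(current)
        current //= factor
    return values


print(reduced_row_limits(20_000))
print(reduced_row_limits(2_000_000))
```

Starting from the default of 20,000, only one reduction (to 2,000) is available under these assumptions.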

Logs and Memory usage

The log files can be used to view how much memory a job uses. 

The JobMemoryManager will indicate how much Heap is/was used. 

  • For optimal memory use, Heap usage should be around 20%.
  • In the case below, the memory is 4 GB and can be reduced to 1 GB (since Heap usage is only 1%).
Text File Input.0 - JobMemoryManager: Total Pause <row limit disabled>/xx ms Heap 62695888b of 4273995776b (1%) 
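
To read these values out of a log file, a line like the one above could be parsed with a regular expression. This is a minimal sketch: the pattern is written against the single sample line in this article, so other engine versions may format the line differently, and the function name is hypothetical.

```python
import re

# Pattern based only on the sample JobMemoryManager line in this article.
HEAP_PATTERN = re.compile(r"Heap (\d+)b of (\d+)b \((\d+)%\)")


def parse_heap_usage(log_line: str):
    """Return heap figures from a JobMemoryManager line, or None if absent."""
    match = HEAP_PATTERN.search(log_line)
    if match is None:
        return None
    used, total, reported = (int(group) for group in match.groups())
    return {
        "used_bytes": used,
        "total_bytes": total,
        "reported_percent": reported,
        "computed_percent": 100 * used / total,
    }


sample = (
    "Text File Input.0 - JobMemoryManager: Total Pause <row limit disabled>/xx ms "
    "Heap 62695888b of 4273995776b (1%)"
)
usage = parse_heap_usage(sample)
print(usage)
```

For the sample line, the computed usage is a little under 1.5%, which the log reports as 1%.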

Feedback Size

Default value

For most masking jobs the default value is optimal and does not need to be changed. The default value is blank (representing 50,000). 

When to change?

The Feedback Size should be adjusted in these scenarios: 

  • If the number of masked rows is large, then increase the Feedback Size to 500,000 or even 5,000,000.
     

The Feedback Size defines how frequently entries are written to the log files. These values are a guideline. One way to choose an appropriate size is to aim for the job's logs fitting into a single log file. 

Size              Number of Records      Feedback Size
Performance Test  -                      10,000
Small to medium   ~5,000,000             50,000 (default)
Large             Up to 500,000,000      500,000
Very large        Up to 5,000,000,000    5,000,000
Super large       Over 5,000,000,000     50,000,000
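
The sizing guideline above can be expressed as a simple lookup, together with a rough count of how many feedback entries each step would write. This is a sketch under stated assumptions: the function names are hypothetical, "~5,000,000" is treated as an inclusive upper bound for the default tier, and the Performance Test row is left out because it has no record count.

```python
# Thresholds follow the sizing table in this article.
FEEDBACK_SIZE_TIERS = [
    (5_000_000, 50_000),         # small to medium (default)
    (500_000_000, 500_000),      # large
    (5_000_000_000, 5_000_000),  # very large
]
SUPER_LARGE_FEEDBACK_SIZE = 50_000_000


def suggest_feedback_size(record_count: int) -> int:
    """Pick a Feedback Size for the given number of records."""
    for max_records, feedback_size in FEEDBACK_SIZE_TIERS:
        if record_count <= max_records:
            return feedback_size
    return SUPER_LARGE_FEEDBACK_SIZE


def feedback_entries_per_step(record_count: int, feedback_size: int) -> int:
    """Approximate number of feedback log entries one step writes."""
    return record_count // feedback_size
```

For 500,000,000 records, a Feedback Size of 500,000 gives about 1,000 entries per step, whereas the default of 50,000 would give about 10,000.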

Effects

This value does NOT affect performance. It only controls how frequently entries are written to the logs. Values set too low will cause a large number of log entries to be collected, which could ultimately affect the masking engine (in worst-case scenarios, crash the engine). 

Warning:

If the Feedback Size is set too small on large tables, this can cause very large logs and can cause the engine to crash.

 

Issue

If there is only one Record Type and it has a filter, the masking engine might hang (DLPX-89175). Setting the Row Limit to 0 (disable) will work around this issue.