Best Practice: Masking Performance (Database) (KBA1561)
KBA
KBA#1561
Masking Performance Background
This article looks at masking performance and more specifically the overall time it takes to perform a masking job. The page will, therefore, go beyond looking at masking rows per minute (rpm) and include startup time as well.
It addresses the following performance areas:
- Startup Time - could be a cause if using algorithms with a large number of lookup values.
- Read Performance - unlikely an issue.
- Transformation Performance - frequent issue if there are many algorithms per table using heavy SM or Custom Algorithms.
- Update Performance - frequent issue related to database configuration or storage issues (also poor network).
- Post-Script Time - could be a cause if post-scripts are used or indexes/constraints need to be recreated
Summary
Startup Time
This is a silent performance killer as it is before we see any masked rows. It is the time it takes to prepare masking algorithms:
- Cause: Large Secure Lookup and Mapping Algorithms.
- Remedy: Try to use a smaller number of lookup values in these algorithms.
Read Performance
The is the SELECT statement that reads the data from the database. Mostly, this is a minor factor in the whole masking operation.
- Cause 1: Large number of columns with large data objects.
- Remedy: Use Custom SQL to reduce the amount of data read from the database (ex. SELECT 1 as MaskCol....)
- Cause 2: Use of complex Custom SQL.
- Remedy: If used, simplify the Custom SQL as much as possible. Look out for nested SELECT statements.
- Cause 3: Network performance - especially latency.
- Remedy: Check with your Network team to improve latency.
Transformation Performance
This is usually a key part in performance investigation. The key factors here are:
- Cause 1: The number of fields masked together with Algorithms used.
- Algorithms with fewer lookups perform better.
- Remedy 1: Try to use simple algorithms with few lookups as possible.
- Remedy 2: Lookout for Segment Mapping - use few segments and Numeric lookups.
- Cause 2: Masking engine performance.
- Remedy 1: Ensure sufficient number of CPUs on the masking engine.
- Remedy 2: Use Streams in the Job Configuration to perform masking of tables in parallel.
- DO NOT USE TO HIGH NUMBER AND ENSURE ENOUGH RAM TO USE IT.
- WATCH OUT FOR DEADLOCKS.
- Cause 3: Java Garbage Collection (GC).
- Remedy 1: Ensure optimal algorithms (see Cause 1) and an appropriate amount of Job Memory.
- GC will always happen. It is part of JAVA, but:
- Too much Job Memory = longer time to perform GC.
- Too little Job Memory = more frequent GC.
Update Performance
This is another key aspect of performance:
- Cause 1: Database Performance
- Remedy: Ensure that the database has enough power to change all columns and rows with the masked data.
- Factor 1: Storage IO
- Factor 2: CPUs
- Factor 3: RAM to cache data.
- Remedy: Ensure that the database has enough power to change all columns and rows with the masked data.
- Cause 2: Indexes and Triggers on Masked Columns and Tables
- Always look out for masked columns with indexes (in the Inventory and on the database) and Triggers (on the database).
- Remedy: In Job Configuration, select Drop Indexes and Disable Triggers. It might be needed to perform this using Pre-Scripts or on the database.
- Cause 3: Index for Update statement
- A Unique Index is needed to look up the row in the Update statement. It should be Unique, Indexed, and not be masked.
- Remedy: It might be needed to create one if the normal Unique ID is being masked.
- MS SQL: The masking engine will create DMS_ROW_ID.
- Oracle: Use ROWID.
- Cause 4: Use parallelism in Job Configuration;
- Remedy 1: Set Updates Threads to 2 or 4
- Remedy 2: Set Streams to 2, 4 or 8.
- Remedy 3: Run concurrent masking Jobs.
- Watch out for Deadlocks.
- Cause 5: Network performance - especially latency.
- Remedy: Check with your Network team to improve latency.
- Cause 6: Casting of values
- Casting doesn't usually have an impact. If it happens on large objects and they are indexed, this can slow down the job significantly.
- Remedy: Check execution plan on the database. Make changes to improve casting and use of indexes.
Details
Below are more details about:
- Time to complete the Masking Job
- Startup Time - Duration
- Masking Time - Duration
- Performance and number of records
- Read Performance
- Masking Performance
- Update Performance
- Performance and number of records
- Pre and Post Scripts
- Factors that affect performance
- Masking Engine
- Database
Time to complete the Masking Job
The graph above shows examples of time to complete a masking job as a function of:
- Startup time (here 15 min)
- Rows per minute (rpm)
- Number of records
In order to finish the masking job in time, it might be needed to look at all these factors. For a small number of records, the startup time plays a significant role. For a large number of records, the rows per minute is a big factor.
Startup Time - Duration
For most of the time, this is not large enough to affect the overall performance of a masking job. The startup time depends on the algorithm used and the number of rows in the lookup data.
Remedy and consideration
Most algorithms are finely tuned and have been defined with a small set of lookup values, most are not exceeding 50,000 values, and as such, only take a few seconds to load.
When creating a custom algorithm, consider the following:
- Selection of algorithm (minimize lookup algorithms).
- Select as small lookup pools as possible.
Masking Time - Duration
Usually, the number of records to mask is fixed and as such, we need to consider how fast we can perform the masking operation - that is the number of rows per minute (rpm). Usually, this number is specified in hundreds of thousands or millions.
What is acceptable performance depends greatly on how many fields are being masked and how many tables at the same time.
In most cases, good performance starts around 500,000 rpm, a normal optimal performance is around 1 million rpm (or 1 Mrpm). With a very fast database server, highly optimized networks and fine-tuned masking jobs, we have seen jobs running in the excess of 8 million rpm.
Performance and number of records
How fast the masking job, in rpm, is affected by three components:
- Read Performance
- Masking Performance
- Update Performance
See above, in the Summary, for specific details.
Pre and Post Scripts
Pre and Post scripts are also affecting performance and these will add masking time at the start of a job (pre-script) and at the end of a job (post-script).
Usually, these scripts can be tested prior to adding them to the masking job and it is important to ensure that these are as effective as possible.
Remedy and Considerations
Factors Affecting Performance
Masking Engine
On the masking engine, make sure that the following items are as optimized as possible:
- Masking Server - raw power:
- Ensure that the server has plenty of fast cores (CPUs) - it is better to have a few fast CPUs than many slow CPUs.
- Lots of cache memory.
- Memory.
- Masking Jobs - efficiency
- Masking algorithms:
- Fast algorithms or custom ones from Professional Services.
- Mask only key columns as each masking operation adds to the degradation of the performance.
- Rule Set:
- Select an index for row lookup (for Oracle and DB2 use rowid).
- Be mindful of long nested Custom SQL statement and filters.
- Job configuration:
- Masking method: In Place
- Batch Update - selected.
- Bulk Data - not selected.
- Drop Indexes - selected or do this manually.
- Disable Triggers and Constraints - selected.
- Masking algorithms:
- Job Configuration and parallelism:
- Update Threads - important:
- Defines the number of Update Statements running at the same time.
- The database and the masking engine need to be able to handle the extra load.
- Verify what works (2, 3, 4...).
- Larger is not always the fastest - 1 can sometimes be the fastest.
- No. of Streams - important:
- Defines the max number of Tables being masked at the same time.
- The database and the masking engine need to be able to handle the extra load.
- Verify what works (2, 3, 4...).
- Larger is not always the fastest - 1 can sometimes be the fastest.
- Multiple Jobs -
- The job can be broken down into multiple jobs.
- The job can be broken down into multiple jobs.
- Update Threads - important:
- Database Server - raw power
- Check with DBA for optimal settings for
- Storage performance
- RAM, CPU and Cache.
- On Oracle: check AWR report for tuning.
- For deep dive - run one or more SQL traces.
- Check with DBA for optimal settings for
- Network Performance
- Latency - minimize for small and large packets.
- Use
ping
from database server to the masking server.
- Use
- Hops - minimize the number of hops to the server.
- Use
traceroute
from database server to the masking server to check.
- Use
- Throughput - Use Delphix Engine CLI throughput test.
- Latency - minimize for small and large packets.
- Database Configurations
- Turn off database logging.
- For Oracle:
- Provision with 3 x 5GB online redo logs (minimum) to avoid pause when transaction logs wraparound
- Provision in NOARCHIVELOG mode to reduce transaction log IO.
- Always check initialization parameters inherited from the parent, remove any expensive checks or irrelevant settings:
- DB_BLOCK_CHECKSUM default is TYPICAL (default)
- DB_BLOCK_CHECKING default is OFF (default)
- DB_LOST_WRITE_PROTECT default is NONE (default)
- DB_ULTRA_SAFE default value is OFF (default)
- PARALLEL_DEGREE_POLICY to AUTO
- PARALLEL_MAX_SERVERS to default, which is PARALLEL_THREADS_PER_CPU * CPU_COUNT * concurrent_parallel_users * 5
- PARALLEL_EXECUTION_MESSAGE_SIZE to maximum (32768)
- Use non-blocking commits for Masking:
- COMMIT_WAIT=NOWAIT
- COMMIT_LOGGING=BATCH