Skip to main content
Delphix

Masking a Column With an Index (KBA7295)

 

 

KBA

KBA# 7295

 

Applicable Delphix Versions

Click here to view the versions of the Delphix engine to which this article applies
Major Release All Sub Releases
6.0 6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0, 6.0.6.0, 6.0.6.1, 6.0.7.0

5.3

5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0, 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0

5.2

5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1

5.1

5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0

5.0

5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1, 5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4

At a Glance

Summary: This KBA describes best practices and general considerations for masking a column with an Index. The possible issues are performance issues, constraint errors, index errors, and unique key errors. 

Manual tasks might be required on some databases.
Applies to:  These procedures apply to all databases and are verified on the following: 
  • DB2 LUW*
  • Oracle*
  • MS SQL Server*
  • MySQL*
  • Maria DB*
  • SAP ASE (Sybase)*
     
  • PostgreSQL
  • DB2 Z/OS and iSeries
  • Custom Connector 

* indicates the Drop Indexes feature is available in the Job Configuration. Note, some indexes might still require manual tasks to drop. 

ME Feature: Delphix Continuous Compliance Engine has a built-in feature for dropping and recreating Indexes on (see above) databases.
Manual - Steps: Below are database agnostic steps on how to drop, mask, and recreate an index:
  1. Identify columns with indexes (or constraints with indexes) using the Vendor's Database tool or query.
  2. Ensure that Drop Indexes is not enabled in the Masking Job. 
  3. Drop the index(s) (or Constraints).
  4. Run the Masking Job.
  5. Recreate the index(s). 
Unique Indexes (Constraints): If uniqueness is required when masking an Indexed column, use one of the following Algorithms: 
  • Character Mapping 
  • Segment Mapping
  • Mapping Algorithm
  • Custom Algorithm developed specifically for a Key (Primary or Foreign) column
More Info: More info on how to access logs: 

Indexes and Variations

Indexes can be both good and bad for a masking job. Having an Index on the Unique Row Identifier (URI) is almost required for good performance - especially for larger tables. A Clustered Index on the URI can also help to minimize Deadlocks on a table. 

Indexes come in different types and can also have Features specific to different databases - for example, in Oracle, a Unique Constraint is usually enforced using an Index.

If there is an Index (or Constraint) on a Masked Column - this will very likely have a negative impact on performance. It can also cause errors, like the case with Oracle and the Unique Constraint. 

This page has been created to cover some of these issues and errors. 

Issues and Errors

Below are some examples of issues and errors encountered with Indexes and the Drop Indexes features. 

Performance issues

The main reason for dropping indexes on a Masked Column is performance. 

When values are masked, the row is updated. If there is an index on the column, then the index needs to be updated as well, which will degrade the performance. The amount of degradation depends on the type of Index. Mostly it is minimal, but it can also be costly.

For example, if the masked column has a Clustered Index, the data is also sorted in a specified order and each updated value will generate a re-order of the table. See 'Example: Masking column with Clustered Index' below as an example. 

The re-order will happen on Oracle if the Index is Index-Organized Table (IOT).

Constraint Errors

Some databases use Indexes to enforce Unique Constraints (Oracle for example). In these cases, there can be an error at two different stages. In both occasions, the root cause is the same - the masked data is not unique.

Error while masking table - Drop Indexes = Off 

If the Index on the column enforces the Unique Constraint and the masked data is not unique, the masking job will fail with a Unique Constraint Violation error:

Example - Oracle:

[JOB_ID_yy_xx] - Update.0 - ORA-00001: unique constraint (...) violated

Error after masking finished - Drop Indexes = On

If 'Drop Indexes' is selected, or the Indexes are created manually or in a Post-Script, and there are duplicated values in the masked data, the following error (or similar) will be shown:

Example - Oracle:

[JOB_ID_yy_xx] - Execute_Create_Index.0 - ERROR: An error occurred, processing will be stopped:
[JOB_ID_yy_xx] - Execute_Create_Index.0 - Couldn't execute SQL: CREATE UNIQUE INDEX ...
...
[JOB_ID_yy_xx] - Execute_Create_Index.0 - ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found

Resolution 

In order to mask a column with a Unique Constraint, the Algorithm has to have Referential Integrity and 1:1 Mapping. The algorithm will, therefore, need to be one of the following: 

  • Character Mapping Algorithm (replaces Segment Mapping)
  • Mapping Algorithm
  • Custom Algorithm developed specifically for a Key (Primary or Foreign) column

Note: On these indexes, it is recommended to start executing the job with Drop Indexes = Off. The reason for this is that the job will fail as soon as there is an error and it will indicate the violation. When the job works, the Drop Indexes can be set to On for extra performance gain. 

Function-Based or Computed Index Errors

Function-Based and Computed Indexes are similar but are implemented differently. These indexes use a database function to alter the data being indexed. For example, in Oracle, these indexes are called Function-Based Indexes and in SQL Server and Sybase they are called Computed Indexes.

The root cause for these errors depends on the function used on the index. 

Error while masking table - Drop Indexes = Off 

If the Index on the column enforces the Unique Constraint and the masked data is not unique, the masking job will fail with a Unique Constraint Violation error:

Example - Oracle:

[JOB_ID_yy_xx] - Update.0 - ORA-00001: unique constraint (...) violated

Error after masking finished - Drop Indexes = On

If Drop Indexes is selected, or the Indexes are created manually or in a Post-Script, and there are duplicated values in the masked data, the following error (or similar) will be shown:

Example - Oracle:

[JOB_ID_yy_xx] - Execute_Create_Index.0 - ERROR: An error occurred, processing will be stopped:
[JOB_ID_yy_xx] - Execute_Create_Index.0 - Couldn't execute SQL: CREATE UNIQUE INDEX ...
...
[JOB_ID_yy_xx] - Execute_Create_Index.0 - ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found

Resolution 

The resolution depends on the function used in the Index. It might be that the masked data needs to have specific characteristics or be unique (see above for algorithms). 

  • Check the index on the database. 
  • Manually drop the index on the masked column.
  • Mask.  
  • Manually recreate the index on the masked column. 

Note: In this case too, it is recommended to start executing the job with the Drop Indexes = Off. The reason for this is that the job will fail as soon as there is an error and it will indicate the violation. When the job works, the Drop Indexes can be set to On for extra performance gain. 

Privilege Errors 

The right privileges are needed in order to Drop and Create Indexes. The Index (or any other object) can also have been created using a different user/schema other than the user/schema for the table. 

The Masking User, therefore, needs to have the correct privileges to perform these actions. 

Minimum privileges required are 'Create Session', 'Select', 'Update', 'Insert', and 'Delete'.  For Drop Indexes (or Disable Triggers/Constraint), the Masking User will require DBA level permissions.

Example: Masking column with Clustered Index

This example shows how data in a masked column with a Clustered Index is causing the table to be sorted. This will affect performance. It will likely not affect storage as the data pages will already change due to the masked data. 

The example below is created using MS SQL Server. 

The configuration 

The masking job is configured as follows:

  • Table Columns:
    • 'ID' (logical key)
    • 'Mask' (masked) (with Clustered Index ASC)
  • Algorithm: 'Mapping Algorithm' (with 8 values (1 to 8))
  • Commit Size: 4
Explanation 

What happens here?

  • The ORDER happens after each UPDATE.
  • This has an impact on performance. 

Events (refer to the tables below):

  • UPDATE 1: A to 6  > Same order as 6 is still first.
  • UPDATE 2: B to 4  > 4 is now first.
  • UPDATE 3: C to 7  
  • UPDATE 4: D to 1  
  • COMMIT (*)
  • UPDATE 5: E to 5  
  • UPDATE 6: F to 8  
  • UPDATE 7: G to 3  
  • UPDATE 8: H to 2  > The table now has a completely new order. 
  • COMMIT (*)

# Original table   # Masked table     # Masked table       # Masked table  
# before masking   # UPDATE 1         # UPDATE 2           # UPDATE 8
----------------   ----------------   ---------------- ... ----------------

SELECT *           SELECT *           ...                  ...
FROM Msk;          FROM Msk;

+----+------+      +----+------+      +----+------+        +----+------+
| ID | Mask |      | ID | Mask |      | ID | Mask |        | ID | Mask |
+----+------+      +----+------+      +----+------+        +----+------+
| 1  | A    |      | 1  | 6    |      | 2  | 4    |        | 4  | 1    |
| 2  | B    |      | 2  | B    |      | 1  | 6    |        | 8  | 2    |
| 3  | C    |      | 3  | C    |      | 3  | C    |        | 7  | 3    |
| 4* | D    |      | 4* | D    |      | 4* | D    |        | 2  | 4    |
+----+------+      +----+------+      +----+------+        +----+------+
| 5  | E    |      | 5  | E    |      | 5  | E    |        | 5  | 5    |
| 6  | F    |      | 6  | F    |      | 6  | F    |        | 1  | 6    |
| 7  | G    |      | 7  | G    |      | 7  | G    |        | 3  | 7    |
| 8* | H    |      | 8* | H    |      | 8* | H    |        | 6  | 8    |
+----+------+      +----+------+      +----+------+        +----+------+

Resolving this issue

For most databases, the solution here is to, on the Job Configuration pop-up, check Drop Indexes.

 

For databases without this feature, or if needed to be executed manually:

  • Manually drop the Clustered Index on the masked column.
  • Mask.  
  • Manually recreate the Clustered Index on the masked column. 

 

Note: Some Clustered Indexes might be added as Constraints - these might not be automatically Disabled and has to be manually removed using ALTER TABLE

 

 

 

 


Related Articles

The following articles may provide more information or related information to this article: