Skip to main content
Delphix

Masking a Column With a Primary Key (PK) (KBA1776)

 

 

KBA

KBA#1776

 

At a Glance

Summary This page describes how to mask columns with Primary Key (PK) constraints. Manual tasks are required in order to successfully mask these.

Note: If the masked column is an IDENTITY column, please see KBA7266 (see link below).
Applies to: These procedures apply to all databases and are verified on the following: 
  • Oracle
  • MS SQL Server
  • SAP ASE (Sybase)
  • MySQL
  • PostgreSQL
  • DB2
  • Maria DB
Algorithm Referential Integrity is a hard requirement when masking a Primary Key column.

The Algorithm needs to be one of the following: 
  • OOTB: Character Mapping Algorithm 
  • OOTB: Segment Mapping (v1 and v2)
  • OOTB: Mapping Algorithm (v1 and v2)
  • Custom Algorithm developed specifically for a Key (Primary or Foreign) column
Steps Below are steps for Oracle and all other databases.
  Oracle On a Continuous Compliance Engine (aka Masking) from version 6.0.11.0, the engine will automatically manage this using the Enable Tasks.

The only step is:
  1. In the Job Configuration - open Enable Tasks and select Drop Indexes and Disable Constraints.

For all other databases and versions, see the steps below.

  All other The below steps are for all Database Connectors (other than Oracle):
  1. Manually:
    1. Drop any Index on the PK and drop/disable the PK Constraint.
    2. Create a temporary IDENTITY column (i.e. 'DLPX_TMP_ID')
    3. AddClustered Index to this column (the best option is a Clustered Index).
  2. Refresh the Rule Set.
  3. Assign the Masking Algorithm to the PK column (see above for algorithms).
  4. Run the Masking Job.
  5. Manually:
    1. Drop the temporary IDENTITY column (starting with its Index).
    2. Recreate the PK and its Index.
Constraints How to disable/drop PK differs from database to database. As an example:
  • DB2 - PK can be dropped. From v10.5, PK can be set as 'not enforced'
  • MS SQL Server - PK can only be dropped
  • MySQL - PK can be dropped or disabled
  • Oracle - PK can be dropped or disabled 
  • Sybase - PK can only be dropped

Note: The masking engine might not be able to drop/disable some PKs. This might need to be done manually.

More info For troubleshooting help:

For information about other column types and Rule Sets: 

Applicable Delphix Versions

Click here to view the versions of the Delphix engine to which this article applies
Date Release
Mar 13, 2023 9.0.0.0, 9.0.0.1
Feb 13, 2023 8.0.0.0
Jan 12, 2023 7.0.0.0
Releases Prior to 2023
Major Release All Sub Releases
6.0

6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0, 6.0.6.0, 6.0.6.1, 6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1, 6.0.13.0, 6.0.13.1, 6.0.14.0, 6.0.15.0, 6.0.16.0, 6.0.17.0, 6.0.17.1, 6.0.17.2

5.3

5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0, 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0

5.2

5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1

5.1

5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0

5.0

5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1, 5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4

Masking Columns with PK Constraints

Below are further details about PK and how to mask these columns. Since the PK will require additional actions, these cannot be disabled using the Job Configuration checkbox. Primary Keys are also frequently referenced by Foreign Keys (see KBA7286 for more details about masking these). 

Delphix is working on automating these steps. Oracle is the first database with automated steps (from Delphix version 6.0.11.0).

 

Warning

Warning:

Masking Primary Keys (PK) and Foreign Keys (FK) requires Manual steps and can be tricky. PK should only be masked if necessary.

 

From Delphix Technical Services:

Primary keys are often an internal identifier and may not require masking since the PII associated with them is masked.  Changing these values does little to add security as long as the associated PII is masked.

Furthermore, masking Primary and Foreign Keys often causes issues for test cases as well as adds complexity to masking applications. An application owner familiar with test cases in lower environments should be consulted prior to expending the effort to mask these.

Salient points 

It is important to note:

  1. The Primary Key can be composite - all masked columns must adhere to the key constraint.
  2. All related Keys (PK and FKs) need to be masked.
  3. The Masking Algorithm must have Referential Integrity and 1:1 unique Mapping.
  4. PK Constraints need to be dropped before masking and recreated after. 
  5. Masking PKs do affect Use Cases, Test Cases, and Down Stream Applications. 
    • Talk to the Application Owner.

About Primary Keys (PK) 

  • A Primary Key is a column (or a group of columns) that uniquely identifies each record in a table.

  • Primary Keys must contain unique values, and cannot contain null values.

  • A table can have only one primary key, which may consist of single or multiple columns.

  • A Primary key is a special Constraint - special as it has special features and it also differs from database to database.

Resolution

Since the masking engine does not know about all PK Constraints and the desired behavior depends on masking requirements, the drop/disable of PK/FK Constraints needs to be scripted and manually dropped/disabled. 

Actions

Prior to Masking 

  • Define the appropriate algorithms for the masked columns.
  • Get the list of all columns with PK and FK required to be masked.
    • All FK Constraints referencing a column that is masked need to be masked.
    • All PK Constraints referenced by a column that is masked need to be masked. 
  • Manually Drop/Disable these FK and PK Constraints. 
  • Refresh the Rule Set before masking.

Post Masking

  • Depending on circumstances and requirements - it might be possible to skip recreating PK and FK Constraints after masking

Drop or disable PK? 

The drop or disable of PK needs to be performed manually (even if some databases allow disabling these):

  • DB2 - PK can be dropped and from v10.5 it can be set as 'not enforced'. 
  • MS SQL Server - PK can only be dropped. 
  • MySQL - PK can be dropped or disabled. 
  • Oracle - PK can be dropped or disabled. 
  • Sybase - PK can only be dropped. 

 

For assistance with scripting this, please contact your Customer Success Manager and Technical Services. 

 

Oracle 

From Masking version 6.0.11, Delphix has automated masking of PK on Oracle. 

The only step needed is - both Constraints and Indexes are needed since a PK is constructed using both.

  1. In the Job Configuration - open Enable Tasks and select Drop Indexes and Disable Constraints.

PostgreSQL 

When masking PostgreSQL, it might be possible to use the system column 'ctid' instead of creating an IDENTITY column.

  • The PK needs to still be dropped/disabled. 
  • Define 'ctid' as a Logical Key. 

Errors and Issues 

When masking columns you may encounter the following issues/errors:

  • Constraint Violations
  • Slow Performance and Log Size Growth

 

Common related errors: 

  • Oracle: ORA-00001, ORA-02291, ORA-02292.
  • MS SQL: Violation of PRIMARY KEY constraint, Foreign key constraint violated.

Constraint Violations 

For the PK and FK Constraints we can have a number of different types of database errors: 

  • PK Constraint violation:
    • The masked value is already in the column.
      For example: masking value x to y and y is already in the column. 
    • The masked values are not unique.  
      For example: masking value x to y and already masked z to y. 
  • FK Constraint violation:
    • PK change violates FK Constraint.
      For example: masking x to y in the parent table but there is an FK that relies upon that there is x in PK. 
    • FK change violates FK Constraint.
      For example: masking x to y in the child table there is no y is in the PK. 

Slow Performance and Log Size Growth

When masking data, generally all rows in the table will be masked.  As a result, for any enabled PK, Unique Key, and FK Constraints, with their associated Indexes, the database has to maintain these while masking. The database also has to generate additional information about the masking changes in the form of undo and redo (Oracle) and transaction logs (examples MS SQL Server, Sybase, DB2) as there will be changes on tables and any indexes associated with the constraints. This potentially makes the masking slower and the database will require more storage to be made available to track Masking Engine generated changes to data.

Note

Note:

 

If the masked column is Indexed the performance will be slower since both the column and the index need to be updated. Drop all indexes on masked columns. 

 

Resolution

Drop Indexes on Masked Columns. 

Database with PK and FK Example

Consider the following diagram of an HRS (Human Resource System) for a moment.

Masking - Mask PK FK HRS.png

Masking Last Name, Department Name, and similar fields are easy and straightforward. Just define the Rule Set and the Job and the masking job will use the PK as the Unique Row Identifier and we are in business.   

To mask the columns DEPNO and EMPNO there are a few challenges: 

  • There is no longer a URI (Unique Row Identifier) as we are masking the PK.
  • Due to the PK Uniqueness Constraint, the PK needs to be unique after masking.
  • Also, the FK needs to be masked to match the PK.
  • Then there is the cross reference where the two tables reference each other - which table fo you start with?

Now take a look at this example:

Masking_-_Mask_PK_FK_HRS_Payroll.png

In the example above, we have more FK references and another system (Payroll System) accessing the data. We need to mask all these columns if we are to mask 'DEPTNO' and 'EMPNO' and if we do not drop/disable the PK and FK Constraints we will have errors. 

Review the following types of errors and issues that can occur:

Examples: PK Constraint Violation

This example will show two different types of Constraint Violations possible when masking a Primary Key:

  • The masked value is already in the column.
  • The algorithm is not producing a 1:1 unique mapping (there are duplicates in the masked data).

The database error message for these two root causes is the same, so let's start looking at what can cause these errors. 

Background details 

To visualize this, let's create an example. This example will show both types of violations by using an algorithm that is not producing a 1:1 unique mapping (Secure Lookup). The allocation of the masked value is random and the same masked value can appear more than once.

The configuration 

The masking job is configured as follows:

  • Table Columns:
    • 'ID' (logical key)
    • 'Mask PK' (masked)
  • Algorithm: Secure Lookup (with 25 values).
  • Commit Size: 4

Explanation 

What happens here?

  • We Commit in groups of 4 rows.
  • In this example, the error happens in the second Commit.

Steps:

  • Commit 1: Success - the first 4 masked values are Committed ok.
    • Note that row with ID: '1' is masked to '2'. This will later violate a duplicated masked value.
    • Note also that the ID order has changed. More about that in an example further down.
  •  Commit 2: Error - the second lot of 4 values are masked and the Commit is failing. 
    • We are masking '6' to '9' and '4' to '2'.
      Both these violate the constraint for two different reasons.
      • UPDATE example 1.
        • The masked value '9' is already in the table and has not been masked yet.
        • We can never allocate '9' as this will violate the PK constraint. 
      • UPDATE example 2:
        • The masked value '2' is also already in the table.
        • Since the mapping is not 1:1 and unique. '1' is masked to '2" and so is '4' now as well. 

Original Table                 # Masked Table
# before masking                 # after masking

SELECT * FROM Msk_Tbl;           SELECT * FROM Msk_Tbl;

+----+---------+                 +----+---------+
ID | Mask PK |                 | ID | Mask PK |
+----+---------+    >> Mask >>   +----+---------+
| 0  | 0       |                 | 3  | 1       |
| 1  | 1       |                 | 1  | 2*      | < Example 2 violates this value (2)
| 2  | 2       |                 | 0  | 15      |
| 3  | 3       |      Commit 1 > | 2  | 21      |
+----+---------+                 +----+---------+
| 4  | 4       |         Error > | 4  | 9*      | < UPDATE example 1
| 5  | 5       |                 | 5  | 4       |   Value (9) already in the column.
| 6  | 6       |         Error > | 6  | 2*      | < UPDATE example 2
| 7  | 7       |      Commit 2 > | 7  | 16      |   Masked value (2) duplicate.
+----+---------+          Stop > +----+---------+
| 8  | 8       |                 | 8  | 8       |
| 9  | 9       |                 | 9  | 9*      | < Example 1 violates this value (9)
| .. | ..      |                 | .. | ..      |
+----+---------+                 +----+---------+

Error messages 

Error messages from different database types. The two scenarios above have the same error message:

  • DB2
    • 'SQLSTATE: 23505. A violation of the constraint imposed by a unique index or a unique constraint occurred.'
       
    • 'SQL0803N One or more values in the INSERT statement, UPDATE statement, or foreign key update caused by a DELETE statement are not valid because the primary key, unique constraint or unique index identified by "[VALUE]" constrains table "[TABLE]" from having duplicate values for the index key. SQLSTATE=23505'
       
  • MS SQL Server
    • 'Violation of PRIMARY KEY constraint '[PK_CONSTRAINT]'. Cannot insert duplicate key in object '[TABLE]'. The duplicate key value is ([VALUE]).'
       
  • MySQL
    • 'ERROR 1062 (23000): Duplicate entry '[NEW_VALUE]' for key [VALUE]'
       
  • Oracle
    • 'ORA-00001: unique constraint ([PK_CONSTRAINT]) violated'
       
  • Sybase
    • 'Check constraint violation occurred, dbname = [DATABASE], table name = [TABLE], constraint name = [PK_CONSTRAINT].'

Resolution 

To solve this issue:

On all masked columns with PK:

  1. Use an algorithm that produces a 1:1 unique mapping - for example, Character MappingMapping Algorithm, or a Custom Algorithm.
  2. Manually Drop or Disable the PK.