Masking a Column With a Primary Key (PK) (KBA1776)
KBA
KBA#1776
At a Glance
Summary | This page describes how to mask columns with Primary Key (PK) constraints. Manual tasks are required in order to successfully mask these. Note: If the masked column is an IDENTITY column, please see KBA7266 (see link below). |
|
---|---|---|
Applies to: | These procedures apply to all databases and are verified on the following:
|
|
Algorithm | Referential Integrity is a hard requirement when masking a Primary Key column. The Algorithm needs to be one of the following:
|
|
Steps | Below are steps for Oracle and all other databases. | |
Oracle | On a Continuous Compliance Engine (aka Masking) from version 6.0.11.0, the engine will automatically manage this using the Enable Tasks. The only step is:
For all other databases and versions, see the steps below. |
|
All other | The below steps are for all Database Connectors (other than Oracle):
|
|
Constraints | How to disable/drop PK differs from database to database. As an example:
Note: The masking engine might not be able to drop/disable some PKs. This might need to be done manually. |
|
More info | For troubleshooting help:
For information about other column types and Rule Sets: |
Applicable Delphix Versions
- Click here to view the versions of the Delphix engine to which this article applies
-
Date Release Mar 13, 2023 9.0.0.0, 9.0.0.1 Feb 13, 2023 8.0.0.0 Jan 12, 2023 7.0.0.0 Releases Prior to 2023 Major Release All Sub Releases 6.0 6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0, 6.0.6.0, 6.0.6.1, 6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1, 6.0.13.0, 6.0.13.1, 6.0.14.0, 6.0.15.0, 6.0.16.0, 6.0.17.0, 6.0.17.1, 6.0.17.2
5.3
5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0, 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0 5.2
5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1
5.1
5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0
5.0
5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1, 5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4
Masking Columns with PK Constraints
Below are further details about PK and how to mask these columns. Since the PK will require additional actions, these cannot be disabled using the Job Configuration checkbox. Primary Keys are also frequently referenced by Foreign Keys (see KBA7286 for more details about masking these).
Delphix is working on automating these steps. Oracle is the first database with automated steps (from Delphix version 6.0.11.0).
From Delphix Technical Services: Primary keys are often an internal identifier and may not require masking since the PII associated with them is masked. Changing these values does little to add security as long as the associated PII is masked. Furthermore, masking Primary and Foreign Keys often causes issues for test cases as well as adds complexity to masking applications. An application owner familiar with test cases in lower environments should be consulted prior to expending the effort to mask these. |
Salient points
It is important to note:
- The Primary Key can be composite - all masked columns must adhere to the key constraint.
- All related Keys (PK and FKs) need to be masked.
- The Masking Algorithm must have Referential Integrity and 1:1 unique Mapping.
- PK Constraints need to be dropped before masking and recreated after.
- Masking PKs do affect Use Cases, Test Cases, and Down Stream Applications.
- Talk to the Application Owner.
About Primary Keys (PK)
-
A Primary Key is a column (or a group of columns) that uniquely identifies each record in a table.
-
Primary Keys must contain unique values, and cannot contain null values.
-
A table can have only one primary key, which may consist of single or multiple columns.
- A Primary key is a special Constraint - special as it has special features and it also differs from database to database.
Resolution
Since the masking engine does not know about all PK Constraints and the desired behavior depends on masking requirements, the drop/disable of PK/FK Constraints needs to be scripted and manually dropped/disabled.
Actions
Prior to Masking
- Define the appropriate algorithms for the masked columns.
- Get the list of all columns with PK and FK required to be masked.
- All FK Constraints referencing a column that is masked need to be masked.
- All PK Constraints referenced by a column that is masked need to be masked.
- Manually Drop/Disable these FK and PK Constraints.
- Refresh the Rule Set before masking.
Post Masking
- Depending on circumstances and requirements - it might be possible to skip recreating PK and FK Constraints after masking
Drop or disable PK?
The drop or disable of PK needs to be performed manually (even if some databases allow disabling these):
- DB2 - PK can be dropped and from v10.5 it can be set as 'not enforced'.
- MS SQL Server - PK can only be dropped.
- MySQL - PK can be dropped or disabled.
- Oracle - PK can be dropped or disabled.
- Sybase - PK can only be dropped.
For assistance with scripting this, please contact your Customer Success Manager and Technical Services.
Oracle
From Masking version 6.0.11, Delphix has automated masking of PK on Oracle.
The only step needed is - both Constraints and Indexes are needed since a PK is constructed using both.
- In the Job Configuration - open Enable Tasks and select Drop Indexes and Disable Constraints.
PostgreSQL
When masking PostgreSQL, it might be possible to use the system column 'ctid' instead of creating an IDENTITY column.
- The PK needs to still be dropped/disabled.
- Define 'ctid' as a Logical Key.
Errors and Issues
When masking columns you may encounter the following issues/errors:
- Constraint Violations
- Slow Performance and Log Size Growth
Common related errors:
- Oracle: ORA-00001, ORA-02291, ORA-02292.
- MS SQL: Violation of PRIMARY KEY constraint, Foreign key constraint violated.
Constraint Violations
For the PK and FK Constraints we can have a number of different types of database errors:
- PK Constraint violation:
- The masked value is already in the column.
For example: masking value x to y and y is already in the column. - The masked values are not unique.
For example: masking value x to y and already masked z to y.
- The masked value is already in the column.
- FK Constraint violation:
- PK change violates FK Constraint.
For example: masking x to y in the parent table but there is an FK that relies upon that there is x in PK. - FK change violates FK Constraint.
For example: masking x to y in the child table there is no y is in the PK.
- PK change violates FK Constraint.
Slow Performance and Log Size Growth
When masking data, generally all rows in the table will be masked. As a result, for any enabled PK, Unique Key, and FK Constraints, with their associated Indexes, the database has to maintain these while masking. The database also has to generate additional information about the masking changes in the form of undo and redo (Oracle) and transaction logs (examples MS SQL Server, Sybase, DB2) as there will be changes on tables and any indexes associated with the constraints. This potentially makes the masking slower and the database will require more storage to be made available to track Masking Engine generated changes to data.
Resolution
Drop Indexes on Masked Columns.
Database with PK and FK Example
Consider the following diagram of an HRS (Human Resource System) for a moment.
Masking Last Name, Department Name, and similar fields are easy and straightforward. Just define the Rule Set and the Job and the masking job will use the PK as the Unique Row Identifier and we are in business.
To mask the columns DEPNO and EMPNO there are a few challenges:
- There is no longer a URI (Unique Row Identifier) as we are masking the PK.
- Due to the PK Uniqueness Constraint, the PK needs to be unique after masking.
- Also, the FK needs to be masked to match the PK.
- Then there is the cross reference where the two tables reference each other - which table fo you start with?
Now take a look at this example:
In the example above, we have more FK references and another system (Payroll System) accessing the data. We need to mask all these columns if we are to mask 'DEPTNO' and 'EMPNO' and if we do not drop/disable the PK and FK Constraints we will have errors.
Review the following types of errors and issues that can occur:
Examples: PK Constraint Violation
This example will show two different types of Constraint Violations possible when masking a Primary Key:
- The masked value is already in the column.
- The algorithm is not producing a 1:1 unique mapping (there are duplicates in the masked data).
The database error message for these two root causes is the same, so let's start looking at what can cause these errors.
Background details
To visualize this, let's create an example. This example will show both types of violations by using an algorithm that is not producing a 1:1 unique mapping (Secure Lookup). The allocation of the masked value is random and the same masked value can appear more than once.
The configuration
The masking job is configured as follows:
- Table Columns:
- 'ID' (logical key)
- 'Mask PK' (masked)
- Algorithm: Secure Lookup (with 25 values).
- Commit Size: 4
Explanation
What happens here?
- We Commit in groups of 4 rows.
- In this example, the error happens in the second Commit.
Steps:
- Commit 1: Success - the first 4 masked values are Committed ok.
- Note that row with ID: '1' is masked to '2'. This will later violate a duplicated masked value.
- Note also that the ID order has changed. More about that in an example further down.
- Commit 2: Error - the second lot of 4 values are masked and the Commit is failing.
- We are masking '6' to '9' and '4' to '2'.
Both these violate the constraint for two different reasons.- UPDATE example 1.
- The masked value '9' is already in the table and has not been masked yet.
- We can never allocate '9' as this will violate the PK constraint.
- UPDATE example 2:
- The masked value '2' is also already in the table.
- Since the mapping is not 1:1 and unique. '1' is masked to '2" and so is '4' now as well.
- UPDATE example 1.
- We are masking '6' to '9' and '4' to '2'.
# Original Table # Masked Table SELECT * FROM Msk_Tbl; SELECT * FROM Msk_Tbl; +----+---------+ +----+---------+ |
Error messages
Error messages from different database types. The two scenarios above have the same error message:
- DB2
- 'SQLSTATE: 23505. A violation of the constraint imposed by a unique index or a unique constraint occurred.'
- 'SQL0803N One or more values in the INSERT statement, UPDATE statement, or foreign key update caused by a DELETE statement are not valid because the primary key, unique constraint or unique index identified by "[VALUE]" constrains table "[TABLE]" from having duplicate values for the index key. SQLSTATE=23505'
- 'SQLSTATE: 23505. A violation of the constraint imposed by a unique index or a unique constraint occurred.'
- MS SQL Server
- 'Violation of PRIMARY KEY constraint '[PK_CONSTRAINT]'. Cannot insert duplicate key in object '[TABLE]'. The duplicate key value is ([VALUE]).'
- 'Violation of PRIMARY KEY constraint '[PK_CONSTRAINT]'. Cannot insert duplicate key in object '[TABLE]'. The duplicate key value is ([VALUE]).'
- MySQL
- 'ERROR 1062 (23000): Duplicate entry '[NEW_VALUE]' for key [VALUE]'
- 'ERROR 1062 (23000): Duplicate entry '[NEW_VALUE]' for key [VALUE]'
- Oracle
- 'ORA-00001: unique constraint ([PK_CONSTRAINT]) violated'
- 'ORA-00001: unique constraint ([PK_CONSTRAINT]) violated'
- Sybase
- 'Check constraint violation occurred, dbname = [DATABASE], table name = [TABLE], constraint name = [PK_CONSTRAINT].'
Resolution
To solve this issue:
On all masked columns with PK:
- Use an algorithm that produces a 1:1 unique mapping - for example, Character Mapping, Mapping Algorithm, or a Custom Algorithm.
- Manually Drop or Disable the PK.