Analyze Non-Conforming Data in Masking (KBA5039)
KBA
KBA# 5039At a Glance
Description: | This page describes how to analyze Non-Conforming data Warnings/Errors. | ||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Non-Conforming Classification: |
The character Classification used on the Masking Engine is matching Oracle RegEx (code shown):
For details about the Unicode characters please look here (or look below): |
||||||||||||||||||||||||||||||||||||
Reporting: | Warnings and Errors are reported on:
|
||||||||||||||||||||||||||||||||||||
Note: Due to bug DLPX-85867 - the Job Monitor might incorrectly show all objects with incorrect status. This issue is resolved in version 11.0. |
|||||||||||||||||||||||||||||||||||||
Configuration - Status Code: | The status code can be set to Fail or Successful should there be Non-Conforming data in a column. This can be
|
||||||||||||||||||||||||||||||||||||
Configuration - Job Execution: | To stop masking the table which contains Non-Confirming data.
|
||||||||||||||||||||||||||||||||||||
Troubleshooting: | Troubleshooting tips:
How to find Non-Conforming Data on different databases: |
Applicable Delphix Versions
- Click here to view the versions of the Delphix engine to which this article applies
-
Date Release Feb 13, 2023 8.0.0.0 Jan 12, 2023 7.0.0.0 Releases Prior to 2023 Major Release All Sub Releases 6.0 6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0, 6.0.6.0, 6.0.6.1, 6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1, 6.0.13.0, 6.0.13.1, 6.0.14.0, 6.0.15.0, 6.0.16.0, 6.0.17.0, 6.0.17.1, 6.0.17.2
5.3
5.3.4.0, 5.3.5.0 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0
Non-Conforming Data
Issue
When an algorithm can't mask the sensitive data (due to invalid characters/data or algorithm configuration), the Masking Engine reports this by giving a hint on what is wrong. Since this is sensitive data, the value can't be displayed. Instead, the value is obfuscated using the keys above.
Since this is a critical issue, the action to abort the Masking Job can be defined should non-conforming data be encountered.
Common issues
The issue with non-conforming data is frequently special characters or non-US letters, but can also be related to the data length. The special characters and foreign letters are usually listed as 'P' (punctuations) but can be listed as 'L' (letters).
Example
The example shows two records that masked and three that failed due to different non-conforming issues. The example uses Segment Mapping with 4 characters Alpha-Numeric.
Algorithm: Segment Mapping (4 characters Alpha-Numeric) +--------+--------+-------------+------------------------------------------------+ | Input | Masked | Non-Conform | Comment | +========+========+=============+================================================+ | 1234 | 3424 | | Masked ok | | ABCD | KENB | | Masked ok | +--------+--------+-------------+------------------------------------------------+ | !AB! | !AB! | PLLP | Contains punctuation. | | ÀÄÅB | ÀÄÅB | LLLL | Tricky, contains accented letters. | +--------+--------+-------------+------------------------------------------------+ | ABCD12 | ABCD12 | LLLLNN | This one is too long. | +--------+--------+-------------+------------------------------------------------+
What is Non-Conforming data
The following is from the Masking Engine UI when defining Inventory and setting Actions.
Nonconforming Data Information It is possible that some data in a dataset does not conform to the structure of the chosen algorithm and masking will fail for this data. For example, if you have a segment mapping algorithm that will mask SSNs with the format NNN-NN-NNNN, and an entry is encountered with format NNN-NN-NNNNN, masking of this data will fail. A warning will be displayed on the job monitor indicating Nonconforming data was present in the affected table. You may control whether the presence of nonconforming data causes the masking job to fail using the "Nonconforming Data" selection on the Settings > Algorithms page. This setting may also be controlled individually for each Segment Mapping algorithm. It is also possible to control whether failure is immediate, or reported after the job runs to completion, using the checking the box under "If Nonconforming Data is encountered" in the Create Job screen. The Job Monitor page (Success or Fail) will help you to troubleshoot which data was nonconforming. When representative nonconforming patterns of data are shown, the character pattern is illustrated as follows:
|
Job Execution and Status Configurations
There are two functional configurations:
- How to Mark job status.
- Whether to Stop job execution.
Algorithm Settings
There are two settings - one Global for all Algorithms and one in each applicable Specific Algorithm.
These settings will only set the Job Status. The default settings are marked with '*'.
+--------------------------+---------------------+---------------+ | Algorithm Global Setting | Specific Algorithm | Job marked as | +==========================+=====================+===============+ | n/a | Mark as Failed | Failed | | n/a | Mark as Succeeded | Succeeded | +--------------------------+---------------------+---------------+ | Mark as Failed* | Use global setting* | Failed | | Mark as Succeeded | | Succeeded | +--------------------------+---------------------+---------------+
Job Configurations
The job can be configured to Stop Execution which will stop masking the table which contains non-conforming data. This feature makes sure that unmasked data is not sent to the masked database.
The default setting is marked with '*'. For 'Job marked as' see above.
If Nonconforming Data is encountered
+------------------------------+---------------+-----------------------------------------------------+ | Stop job on first occurrence | Job marked as | Job Status | +==============================+===============+=====================================================+ | Not ticked* | Failed | All rows - job marked as Failed. | | | Succeeded | All rows - job marked as Successful (with warning). | +------------------------------+---------------+-----------------------------------------------------+ | Ticked | Failed | Table with non-conf. terminated. | | | Succeeded | All rows - job marked as Successful (with warning). | +------------------------------+---------------+-----------------------------------------------------+
Classifications and Examples
Below is a complete listing of all classifications and sub-classifications used to categorize the non-conformant data.
Letters (L)
Sub-Classification | Examples - US ASCII | Examples - Unicode and Other |
---|---|---|
Lower Case Letters | a, b, c, d, ..., z | µ, ß, à, æ, ... |
Upper Case Letters | A, B, C, D, ..., Z | À, Æ, Ç, Ň, ... |
Modifier Letters | n/a | ᴬ, ᴭ, ʰ, ʶ, ... |
Titlecase Letters | n/a | Dž, Lj, ᾈ, ... |
Other Letters | n/a | ª, º, ƻ, ج, ش, ഘ, オ, ポ, ... |
Numbers (N)
Sub-Classification | Examples - US ASCII | Examples - Unicode and Other |
---|---|---|
Decimal Numbers | 0, 1, 2, 3, ..., 9 | ٠, ٠, २, ४, ... |
Letter Numbers | n/a | ᛮ, ᛯ, ᛰ, Ⅰ, Ⅱ, Ⅲ, ... |
Other Numbers | n/a | ², ³, ¼, ½, ৴, ৵, ... |
Separators (Z)
Sub-Classification | Examples - US ASCII | Examples - Unicode and Other |
---|---|---|
Space Separators | [space] | , [different size spaces] |
Line Separators | Not visible | Not visable |
Paragraph Separators | Not visible | Not visable |
Punctuation (P)
Sub-Classification | Examples - US ASCII | Examples - Unicode and Other |
---|---|---|
Close Punctuation | ), ], } | ༻, ༽, ᚜, ⁆, ⟧, ... |
Connector Punctuation | _ | ‿, ⁀, ⁔, ︳, ﹍ |
Dash Punctuation | - | -, ⸗, ⸚, 〜, ... |
Final Punctuation | » | ’, ”, ›, ⸃, ... |
Initial Punctuation | « | ‘, ‛, “, ‟, ... |
Open Punctuation | (, [, { | ༺, ༼, ᚛, ⟦, ... |
Other Punctuation | !, ", #, %, &, *, /, :, ... | ՞, ։, ؊, ؟, ๏, ๛, ៘, ... |
Symbols (S)
Sub-Classification | Examples - US ASCII | Examples - Unicode and Other |
---|---|---|
Currency Symbol | $ | ¢, £, ¥, ֏, ... |
Math Symbol | +, <, =, >, |, ~, ... | ϶, ؆, ؇, ⅀, ⅁, ... |
Modifier Symbol | ^, `, ¨, ¯, ... | ꜈, ꜉, ꜊, ꜋ , ꜎, ꜠, ... |
Other Symbol | ¦, ©, ®, ° | ҂, ؎, ؏, ۞, ... |
Marks (M)
Sub-Classification | Examples - US ASCII | Examples - Unicode and Other |
---|---|---|
Enclosing Marks | n/a | ҈, ҉, ᪾, ... |
Nonspacing Marks | n/a | ۖ, ۗ, ۘ , ... |
Spacing Marks | n/a | ः, ऻ, ा, ि, ... |
Troubleshooting
To troubleshoot non-conforming data one has to look at the data. There are no details provided in the bundle or in the logs other than the non-conforming classification shown above.
To understand what is causing the warnings/errors the data has to be queried. Below are troubleshooting tips and queries that can assist in the investigation:
Troubleshooting tips:
How to find Non-Conforming Data on different databases:
- KBA: Profile Data in Database Column for Oracle (KBA4448)
- KBA: Profile Data in Database Column for SQL Server (KBA5051)
- KBA: Profiling Data in Database Column for DB2 (KBA6995)
Reporting and Job Monitoring: