Skip to main content
Delphix

Analyze Non-Conforming Data in Masking (KBA5039)

 

 

KBA

KBA# 5039

At a Glance

Description: This page describes how to analyze Non-Conforming data Warnings/Errors. 
Non-Conforming Classification:   

The character Classification used on the Masking Engine is matching Oracle RegEx (code shown): 

Classification regexp US ASCII Unicode 
'L'=Letter [:alpha:] A B C D ... Z All letters
'N'=Number [:digit:] 0 1 2 3 ... 9 Numbers
'Z'=Separators [:space:] [Space] All separators
'P'=Punctuations [:punct:] ( { [ "' . !  etc All punctuations
'S'=Symbols [:punct:] $ * + - /  © etc All symbols
'O'=Other [:cntrl:] ESC BEL etc All control 
'M'=Marks - - All marks
'U'=Unknown - - Other

For details about the Unicode characters please look here (or look below): 
https://www.compart.com/en/unicode/category/

Reporting: Warnings and Errors are reported on:
  • the Monitor page (Table level warnings/errors).
  • on the Logs page on the Admin tab.
Note: Due to bug DLPX-85867 - the Job Monitor might incorrectly show all objects with incorrect status.

This issue is resolved in version 11.0.
Configuration - Status Code: The status code can be set to Fail or Successful should there be Non-Conforming data in a column. This can be  
  • Global Setting on the Settings tab on the Algorithm page.
  • Per algorithm, in each applicable Algorithm popup.
Configuration - Job Execution: To stop masking the table which contains Non-Confirming data. 
  • Job Configuration popup by ticking "Stop job on first occurrence".
Troubleshooting: Troubleshooting tips: 

How to find Non-Conforming Data on different databases:

Applicable Delphix Versions

Click here to view the versions of the Delphix engine to which this article applies
Date Release
Feb 13, 2023 8.0.0.0
Jan 12, 2023 7.0.0.0
Releases Prior to 2023
Major Release All Sub Releases
6.0

6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0, 6.0.6.0, 6.0.6.1, 6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1, 6.0.13.0, 6.0.13.1, 6.0.14.0, 6.0.15.0, 6.0.16.0, 6.0.17.0, 6.0.17.1, 6.0.17.2

5.3

5.3.4.0, 5.3.5.0 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0

Non-Conforming Data

Issue

When an algorithm can't mask the sensitive data (due to invalid characters/data or algorithm configuration), the Masking Engine reports this by giving a hint on what is wrong. Since this is sensitive data, the value can't be displayed. Instead, the value is obfuscated using the keys above.

Since this is a critical issue, the action to abort the Masking Job can be defined should non-conforming data be encountered. 

Common issues

The issue with non-conforming data is frequently special characters or non-US letters, but can also be related to the data length. The special characters and foreign letters are usually listed as 'P' (punctuations)  but can be listed as 'L' (letters). 

Example

The example shows two records that masked and three that failed due to different non-conforming issues. The example uses Segment Mapping with 4 characters Alpha-Numeric. 

Algorithm: Segment Mapping (4 characters Alpha-Numeric)

+--------+--------+-------------+------------------------------------------------+
| Input  | Masked | Non-Conform | Comment                                        |
+========+========+=============+================================================+
| 1234   | 3424   |             | Masked ok                                      |
| ABCD   | KENB   |             | Masked ok                                      |
+--------+--------+-------------+------------------------------------------------+
| !AB!   | !AB!   | PLLP        | Contains punctuation.                          |
| ÀÄÅB   | ÀÄÅB   | LLLL        | Tricky, contains accented letters.             |
+--------+--------+-------------+------------------------------------------------+
| ABCD12 | ABCD12 | LLLLNN      | This one is too long.                          |
+--------+--------+-------------+------------------------------------------------+

What is Non-Conforming data

The following is from the Masking Engine UI when defining Inventory and setting Actions. 

Nonconforming Data Information

It is possible that some data in a dataset does not conform to the structure of the chosen algorithm and masking will fail for this data.

For example, if you have a segment mapping algorithm that will mask SSNs with the format NNN-NN-NNNN, and an entry is encountered with format NNN-NN-NNNNN, masking of this data will fail. A warning will be displayed on the job monitor indicating Nonconforming data was present in the affected table.

You may control whether the presence of nonconforming data causes the masking job to fail using the "Nonconforming Data" selection on the Settings > Algorithms page. This setting may also be controlled individually for each Segment Mapping algorithm.

It is also possible to control whether failure is immediate, or reported after the job runs to completion, using the checking the box under "If Nonconforming Data is encountered" in the Create Job screen.

The Job Monitor page (Success or Fail) will help you to troubleshoot which data was nonconforming. When representative nonconforming patterns of data are shown, the character pattern is illustrated as follows:

  • N for digits
  • L for letters
  • M for marks
  • P for punctuation
  • S for symbols
  • Z for separator
  • O for other
  • U for unknown

Job Execution and Status Configurations 

There are two functional configurations:

  • How to Mark job status.
  • Whether to Stop job execution.

Algorithm Settings

There are two settings - one Global for all Algorithms and one in each applicable Specific Algorithm.

These settings will only set the Job Status. The default settings are marked with '*'.

+--------------------------+---------------------+---------------+
| Algorithm Global Setting | Specific Algorithm  | Job marked as |
+==========================+=====================+===============+
| n/a                      | Mark as Failed      | Failed        |
| n/a                      | Mark as Succeeded   | Succeeded     |
+--------------------------+---------------------+---------------+
| Mark as Failed*          | Use global setting* | Failed        |
| Mark as Succeeded        |                     | Succeeded     |
+--------------------------+---------------------+---------------+

Job Configurations

The job can be configured to Stop Execution which will stop masking the table which contains non-conforming data. This feature makes sure that unmasked data is not sent to the masked database.

 

Note

Note:

There are situations where it is ok to have unmasked data and in this case, the job should succeed even if there is non-conforming data (for example codes in dates fields or phone numbers). Though, it is better to make sure the data and algorithms conform. 

The masking job will only stop masking the table with non-confirming data that failed. 
When non-conforming data is identified the masking will terminate at the current location. No unmasked data will be written to the database or the file.

DLPX-69282 has been raised to change the wording or the behavior of this. 

 

The default setting is marked with '*'. For 'Job marked as' see above.

If Nonconforming Data is encountered 
+------------------------------+---------------+-----------------------------------------------------+
| Stop job on first occurrence | Job marked as | Job Status                                          |
+==============================+===============+=====================================================+
| Not ticked*                  | Failed        | All rows - job marked as Failed.                    |
|                              | Succeeded     | All rows - job marked as Successful (with warning). |
+------------------------------+---------------+-----------------------------------------------------+
| Ticked                       | Failed        | Table with non-conf. terminated.                    |
|                              | Succeeded     | All rows - job marked as Successful (with warning). |
+------------------------------+---------------+-----------------------------------------------------+

Classifications and Examples

Below is a complete listing of all classifications and sub-classifications used to categorize the non-conformant data. 

Letters (L)

Sub-Classification Examples - US ASCII Examples - Unicode and Other
Lower Case Letters a, b, c, d, ..., z µ, ß, à, æ, ...
Upper Case Letters A, B, C, D, ..., Z À, Æ, Ç, Ň, ...
Modifier Letters n/a ᴬ, ᴭ, ʰ, ʶ, ...
Titlecase Letters n/a Dž, Lj, ᾈ, ...
Other Letters n/a ª, º, ƻ, ج, ش, ഘ, オ, ポ, ...

Numbers (N)

Sub-Classification Examples - US ASCII Examples - Unicode and Other
Decimal Numbers 0, 1, 2, 3, ..., 9 ٠, ٠, २, ४, ... 
Letter Numbers n/a ᛮ, ᛯ, ᛰ, Ⅰ, Ⅱ, Ⅲ, ...
Other Numbers n/a ², ³, ¼, ½, ৴, ৵, ...

Separators (Z)

Sub-Classification Examples - US ASCII  Examples - Unicode and Other 
Space Separators [space]  , [different size spaces]
Line Separators Not visible Not visable
Paragraph Separators Not visible Not visable

Punctuation (P)

Sub-Classification Examples - US ASCII  Examples - Unicode and Other 
Close Punctuation ), ], } ༻, ༽,  ᚜, ⁆, ⟧, ...
Connector Punctuation _ ‿, ⁀, ⁔, ︳, ﹍
Dash Punctuation - -, ⸗, ⸚, 〜, ... 
Final Punctuation » ’, ”, ›, ⸃, ...
Initial Punctuation « ‘, ‛, “, ‟, ...
Open Punctuation (, [, { ༺, ༼, ᚛, ⟦, ...
Other Punctuation !, ", #, %, &, *, /, :, ... ՞, ։, ؊, ؟, ๏, ๛, ៘, ...

Symbols (S)

Sub-Classification Examples - US ASCII  Examples - Unicode and Other 
Currency Symbol $ ¢, £, ¥, ֏, ...
Math Symbol +, <, =, >, |, ~, ... ϶, ؆, ؇, ⅀, ⅁, ...
Modifier Symbol ^, `, ¨, ¯, ... ꜈, ꜉, ꜊, ꜋ , ꜎, ꜠, ...
Other Symbol ¦, ©, ®, ° ҂, ؎, ؏, ۞, ...

Marks (M)

Sub-Classification Examples - US ASCII  Examples - Unicode and Other 
Enclosing Marks n/a ҈, ҉, ᪾, ... 
Nonspacing Marks n/a ۖ, ۗ, ۘ  , ...
Spacing Marks n/a ः, ऻ, ा, ि, ...

Other (O)

Sub-Classification Examples - US ASCII  Examples - Unicode and Other
Control NULL, ACK, BEL, ESC, ... None
Format SHY LRO, RLO, RLI, LRI, BOM, ...
Private Use None None
Surrogate None None

 

Troubleshooting

To troubleshoot non-conforming data one has to look at the data. There are no details provided in the bundle or in the logs other than the non-conforming classification shown above.

To understand what is causing the warnings/errors the data has to be queried. Below are troubleshooting tips and queries that can assist in the investigation:


Troubleshooting tips: 


How to find Non-Conforming Data on different databases:


Reporting and Job Monitoring: