Delphix

Analyze Non-Conforming Data in Masking (KBA5039)

 


 

At a Glance

Versions: Applicable Delphix Masking versions: 5.3 (from 5.3.4), 6.x
Description: This page describes how to analyze Non-Conforming data Warnings/Errors. 
Non-Conforming Classification:   

The Classification used on the Masking Engine is: 

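+----------------+------+----------------------------+
| Classification | Code | Characters                 |
+----------------+------+----------------------------+
| Letter         | 'L'  | All Unicode letters.       |
| Number         | 'N'  | All Unicode numbers.       |
| Marks          | 'M'  | All Unicode marks.         |
| Separators     | 'Z'  | All Unicode separators.    |
| Punctuations   | 'P'  | All Unicode punctuations.  |
| Symbols        | 'S'  | All Unicode symbols.       |
| Other          | 'O'  |                            |
+----------------+------+----------------------------+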

For details about the Unicode characters, see the following page (or the Classifications and Examples section below):
https://www.compart.com/en/unicode/category/
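
The codes above appear to follow the first letter of the Unicode general category, so a reported pattern can be approximated outside the engine. The following is a minimal Python sketch, not the Masking Engine's implementation: it assumes the engine's classification agrees with Python's `unicodedata` module, and the function name `classify` is purely illustrative. Unicode labels the "Other" group with 'C', which the sketch remaps to 'O'.

```python
import unicodedata

# Unicode reports the "Other" group with a leading 'C' (Cc, Cf, Co, Cs, Cn).
# The classification table lists it as 'O', so it is remapped (assumption).
_REMAP = {"C": "O"}


def classify(text: str) -> str:
    """Return one classification code per character of `text`."""
    codes = []
    for ch in text:
        major = unicodedata.category(ch)[0]  # 'Lu' -> 'L', 'Nd' -> 'N', ...
        codes.append(_REMAP.get(major, major))
    return "".join(codes)


if __name__ == "__main__":
    for sample in ["AB-12", "a b", "\t"]:
        print(repr(sample), classify(sample))
```

The sketch works per code point, so a letter followed by a combining accent yields two codes ('L' then 'M').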

Applies to: Non-Conforming Warnings and Errors apply to:
  • Segment Mapping algorithms.
  • DateShift algorithms. 
  • PHONE SL algorithm.
  • ZIP+4 algorithm.
Reporting: Warnings and Errors are reported on:
  • The Monitor page (table-level warnings/errors).
  • The Logs page on the Admin tab.
Configuration: The action triggered when Non-Conforming data is encountered is defined in:
  • The Job Configuration popup.
  • The Settings tab on the Algorithm page.
  • The Segment Mapping popup.

Issue

When the algorithm in use does not match the data to be masked, the Masking Engine now reports this and gives hints about what is wrong.

Furthermore, because this is a critical issue, the Masking Job can be configured to abort when Non-Conforming data is encountered.

When an algorithm is defined for specific characters and the data does not match, it is essential that this is reported, because the data will not be masked.

Non-Conforming data is frequently caused by Special Characters or Foreign (read: non-US) Letters, but it can also be related to the data length.

Special Characters are usually listed as 'P', while Foreign Letters are listed as 'L', which makes them harder to spot in the reported pattern.

Example

The example shows two records that were masked successfully and three that failed due to different Non-Conforming issues. The example uses Segment Mapping with 4 alphanumeric characters.

+--------+--------+-------------+--------------------------------------------------+
| Input  | Masked | Non-Conform | Comment                                          |
+--------+--------+-------------+--------------------------------------------------+
| 1234   | 3424   |             | Masked ok                                        |
| ABCD   | KENB   |             | Masked ok                                        |
+--------+--------+-------------+--------------------------------------------------+
| !AB!   | !AB!   | PLLP        | Failed - punctuations.                           |
| ÀÄÅB   | ÀÄÅB   | LLLL        | Failed - tricky as it includes foreign letters.  |
+--------+--------+-------------+--------------------------------------------------+
| ABCD12 | ABCD12 | LLLLNN      | Failed - too long.                               |
+--------+--------+-------------+--------------------------------------------------+
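
The rows of the example can be reproduced with a short Python sketch. It is an illustration only and makes two assumptions that the example does not state explicitly: that the segment accepts exactly four characters, and that "Alpha-Numeric" means ASCII letters and digits. The names `ALLOWED`, `SEGMENT_LENGTH`, `pattern`, and `check` are hypothetical.

```python
import string
import unicodedata

# Assumptions: ASCII letters and digits only, exactly four characters.
ALLOWED = set(string.ascii_letters + string.digits)
SEGMENT_LENGTH = 4


def pattern(value: str) -> str:
    """Classification pattern, with Unicode's 'C' group shown as 'O'."""
    return "".join(unicodedata.category(ch)[0] for ch in value).replace("C", "O")


def check(value: str):
    """Return (conforms, pattern, reasons) for one input value."""
    reasons = []
    if len(value) != SEGMENT_LENGTH:
        reasons.append(f"length {len(value)} instead of {SEGMENT_LENGTH}")
    bad = [ch for ch in value if ch not in ALLOWED]
    if bad:
        reasons.append("characters outside the segment definition: " + " ".join(bad))
    return (not reasons, pattern(value), reasons)


if __name__ == "__main__":
    # "\u00c0\u00c4\u00c5B" is the precomposed form of the fourth input row.
    for value in ["1234", "ABCD", "!AB!", "\u00c0\u00c4\u00c5B", "ABCD12"]:
        print(value, check(value))
```

The fourth input shows why foreign letters are tricky: its pattern is identical to that of a conforming value, so the pattern alone does not reveal the problem.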

What is Non-Conforming data - from Masking Engine UI

The following is from the Masking Engine UI when defining Inventory and setting Actions. 

Nonconforming Data Information

It is possible that some data in a dataset does not conform to the structure of the chosen algorithm and masking will fail for this data.

For example, if you have a segment mapping algorithm that will mask SSNs with the format NNN-NN-NNNN, and an entry is encountered with format NNN-NN-NNNNN, masking of this data will fail. A warning will be displayed on the job monitor indicating Nonconforming data was present in the affected table.

You may control whether the presence of nonconforming data causes the masking job to fail using the "Nonconforming Data" selection on the Settings > Algorithms page. This setting may also be controlled individually for each Segment Mapping algorithm.

It is also possible to control whether failure is immediate, or reported after the job runs to completion, by checking the box under "If Nonconforming Data is encountered" in the Create Job screen.

The Job Monitor page (Success or Fail) will help you to troubleshoot which data was nonconforming. When representative nonconforming patterns of data are shown, the character pattern is illustrated as follows:

  • N for digits
  • L for letters
  • M for marks
  • P for punctuation
  • S for symbols
  • Z for separator
  • O for other
  • U for unknown

Classifications and Examples

Below is a complete listing of all Classifications and Sub-Classifications used to categorize the Non-Conformant data. 

Letters (L)

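+--------------------+--------------------+-------------------------------+
| Sub-Classification | US ASCII Example   | Unicode and Other Examples    |
+--------------------+--------------------+-------------------------------+
| Lower Case Letters | a, b, c, d, ..., z | µ, ß, à, æ, ...               |
| Upper Case Letters | A, B, C, D, ..., Z | À, Æ, Ç, Ň, ...               |
| Modifier Letters   | None               | ᴬ, ᴭ, ʰ, ʶ, ...               |
| Titlecase Letters  | None               | Dž, Lj, ᾈ, ...                 |
| Other Letters      | None               | ª, º, ƻ, ج, ش, ഘ, オ, ポ, ... |
+--------------------+--------------------+-------------------------------+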

Numbers (N)

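+--------------------+--------------------+----------------------------+
| Sub-Classification | US ASCII Example   | Unicode and Other Examples |
+--------------------+--------------------+----------------------------+
| Decimal Numbers    | 0, 1, 2, 3, ..., 9 | ٠, ٠, २, ४, ...            |
| Letter Numbers     | None               | ᛮ, ᛯ, ᛰ, Ⅰ, Ⅱ, Ⅲ, ...      |
| Other Numbers      | None               | ², ³, ¼, ½, ৴, ৵, ...      |
+--------------------+--------------------+----------------------------+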

Marks (M)

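+--------------------+------------------+----------------------------+
| Sub-Classification | US ASCII Example | Unicode and Other Examples |
+--------------------+------------------+----------------------------+
| Enclosing Marks    | None             | ҈, ҉, ᪾, ...               |
| Nonspacing Marks   | None             | ۖ, ۗ, ۘ  , ...               |
| Spacing Marks      | None             | ः, ऻ, ा, ि, ...            |
+--------------------+------------------+----------------------------+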

Separators (Z)

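+----------------------+------------------+----------------------------+
| Sub-Classification   | US ASCII Example | Unicode and Other Examples |
+----------------------+------------------+----------------------------+
| Space Separators     | [space]          |  , [different size spaces] |
| Line Separators      | Not visible      | None                       |
| Paragraph Separators | Not visible      | None                       |
+----------------------+------------------+----------------------------+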

Punctuation (P)

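+-----------------------+------------------------------+----------------------------+
| Sub-Classification    | US ASCII Example             | Unicode and Other Examples |
+-----------------------+------------------------------+----------------------------+
| Close Punctuation     | ), ], }                      | ༻, ༽,  ᚜, ⁆, ⟧, ...        |
| Connector Punctuation | _                            | ‿, ⁀, ⁔, ︳, ﹍            |
| Dash Punctuation      | -                            | -, ⸗, ⸚, 〜, ...           |
| Final Punctuation     | »                            | ’, ”, ›, ⸃, ...            |
| Initial Punctuation   | «                            | ‘, ‛, “, ‟, ...            |
| Open Punctuation      | (, [, {                      | ༺, ༼, ᚛, ⟦, ...            |
| Other Punctuation     | !, ", #, %, &, *, /, :, ...  | ՞, ։, ؊, ؟, ๏, ๛, ៘, ...   |
+-----------------------+------------------------------+----------------------------+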

Symbols (S)

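+--------------------+-----------------------+----------------------------+
| Sub-Classification | US ASCII Example      | Unicode and Other Examples |
+--------------------+-----------------------+----------------------------+
| Currency Symbol    | $                     | ¢, £, ¥, ֏, ...            |
| Math Symbol        | +, <, =, >, |, ~, ... | ϶, ؆, ؇, ⅀, ⅁, ...         |
| Modifier Symbol    | ^, `, ¨, ¯, ...       | ꜈, ꜉, ꜊, ꜋ , ꜎, ꜠, ...     |
| Other Symbol       | ¦, ©, ®, °            | ҂, ؎, ؏, ۞, ...            |
+--------------------+-----------------------+----------------------------+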

Other (O)

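+--------------------+--------------------------+------------------------------+
| Sub-Classification | US ASCII Example         | Unicode and Other Examples   |
+--------------------+--------------------------+------------------------------+
| Control            | NULL, ACK, BEL, ESC, ... | None                         |
| Format             | SHY                      | LRO, RLO, RLI, LRI, BOM, ... |
| Private Use        | None                     | None                         |
| Surrogate          | None                     | None                         |
+--------------------+--------------------------+------------------------------+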
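
The sub-classifications listed in this section correspond by name to the two-letter Unicode general categories, which makes it possible to look up where a suspicious character belongs. The Python sketch below is an illustration under that assumption and is not taken from the Masking Engine; the mapping `SUB_CLASSIFICATION` and the function `sub_classification` are hypothetical names.

```python
import unicodedata

# Two-letter Unicode general category -> sub-classification name used above.
SUB_CLASSIFICATION = {
    "Ll": "Lower Case Letters", "Lu": "Upper Case Letters",
    "Lm": "Modifier Letters", "Lt": "Titlecase Letters", "Lo": "Other Letters",
    "Nd": "Decimal Numbers", "Nl": "Letter Numbers", "No": "Other Numbers",
    "Me": "Enclosing Marks", "Mn": "Nonspacing Marks", "Mc": "Spacing Marks",
    "Zs": "Space Separators", "Zl": "Line Separators",
    "Zp": "Paragraph Separators",
    "Pe": "Close Punctuation", "Pc": "Connector Punctuation",
    "Pd": "Dash Punctuation", "Pf": "Final Punctuation",
    "Pi": "Initial Punctuation", "Ps": "Open Punctuation",
    "Po": "Other Punctuation",
    "Sc": "Currency Symbol", "Sm": "Math Symbol",
    "Sk": "Modifier Symbol", "So": "Other Symbol",
    "Cc": "Control", "Cf": "Format", "Co": "Private Use", "Cs": "Surrogate",
}


def sub_classification(ch: str):
    """Return (category code, sub-classification name) for one character."""
    code = unicodedata.category(ch)
    return code, SUB_CLASSIFICATION.get(code, "Unassigned (not listed above)")


if __name__ == "__main__":
    for ch in ["a", "\u00df", "$", "_", "\u00ab", "\u00ad"]:
        print(f"U+{ord(ch):04X}", sub_classification(ch))
```

Printing the code point (for example `U+00AD` for the soft hyphen) is useful because several of these characters are invisible when displayed.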

 

Troubleshooting

To troubleshoot Non-Conforming data, the data itself has to be examined. Neither the bundle nor the logs provide any details other than the Non-Conforming Classification shown above.
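
One way to examine the data is to group the column values by their classification pattern and compare the result with the patterns reported on the Monitor page. The Python sketch below is an illustration only. It assumes the column values have already been fetched into a Python list by whatever means is available, and that the engine's classification agrees with Python's `unicodedata` module; `pattern` and `summarize` are hypothetical names. Values are additionally split by whether they are pure ASCII, because foreign letters share the 'L' code with ASCII letters.

```python
from collections import Counter
import unicodedata


def pattern(value: str) -> str:
    """Classification pattern, with Unicode's 'C' group shown as 'O'."""
    return "".join(unicodedata.category(ch)[0] for ch in value).replace("C", "O")


def summarize(values):
    """Group values by (pattern, is_ascii).

    Returns rows of (pattern, is_ascii, count, first example), most
    frequent group first. None values (SQL NULLs) are skipped.
    """
    counts = Counter()
    examples = {}
    for value in values:
        if value is None:
            continue
        key = (pattern(value), value.isascii())
        counts[key] += 1
        examples.setdefault(key, value)
    return [(p, a, n, examples[(p, a)]) for (p, a), n in counts.most_common()]


if __name__ == "__main__":
    column = ["1234", "ABCD", "\u00c0\u00c4\u00c5B", "5678", None, "!AB!"]
    for row in summarize(column):
        print(row)
```

Groups with unexpected patterns, unexpected lengths, or `is_ascii` set to `False` are the first candidates to inspect.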

To understand what is causing the warnings/errors the data has to be queried. Examples of queries that can assist investigation can be found here: