Special Characters in Secure Lookup (KBA5285)
KBA
KBA# 5285Applicable Delphix Versions
This document is relevant to all Masking Engine versions.
At a Glance
Characters: | The affected characters are BOM, TAB, a hyphen, and accents above some letters. |
---|---|
Algorithms: | The affected algorithms are:
Name | Algorithm code | Special chars ---------------------+------------------+------------------- ADDRESS LINE 2 SL | AddrLine2Lookup | BOM ADDRESS LINE SL | AddrLookup | BOM FULL_NM_SL | FullNMLookup | TABs USCITIES_SL | USCitiesLookup | Special chars US_COUNTIES_SL | USCountiesLookup | BOM, special chars |
Terms used: | BOM = Byte Order Mark TAB = tabulator |
Background
When these algorithms were created, some of the values in the source file had special characters in them (some are legitimate and part of the name and others were created by tools or systems used). These characters have been with these algorithms ever since.
Since algorithms cannot be modified, as this will result in a different masked result, these characters will remain in these algorithms.
Most of the time, these characters do not cause any issues.
This KBA documents the algorithms, the values, and give some examples of errors they can cause.
Special Characters
There are some masked values that have characters outside the old ASCII standard. This can be an issue if the database's or file's characters restrict the use of these characters.
Could cause error
The characters in these algorithms could cause the following error - the example below is from Sybase.
Error converting characters into server's character set. Some character(s) could not be converted.
Algorithms and values
Below is a list of algorithms and values:
USCITIES_SL "Winston�Salem" US_COUNTIES_SL "Coös County" "Doña Ana County" "Prince of Wales – Hyder Census Area" "Valdez–Cordova Census Area" "Yukon–Koyukuk Census Area"
Byte Order Mark (BOM)
The BOM character is hidden characters at the start of a file to indicate the encoding type of the file. The BOM character will appear in the first value of some algorithms. It was caused by a BOM in the original imported lookup file.
Could case error
Due to the character, when trimmed it will result in an extra character and if the column length is too short this can result in the value being too long for the column.
Example from Oracle:
ORA-12899: value too large for column
Algorithms and values
Below is a list of algorithms and the column lengths causing an error:
ADDRESS LINE SL - Possible error if column length shorter or equeal to 21 characters. ADDRESS LINE 2 SL - Possible error if column length shorter or equeal to 9 characters. US_COUNTIES_SL - Possible error if column length shorter or equeal to 19 characters.
TAB character
One algorithm has a TAB character between some of the values. This does not usually cause an issue, though will cause an issue if a tab-delimited file is masked, and this algorithm is used.
Could case error
If this algorithm is used on a tab-delimited file, the column alignment will be out of order.
Algorithm
The below algorithm has a tab between some values:
FULL_NM_SL
Resolution
To resolve this issue create a new custom algorithm.
If the original lookup values are required, please create a case with Delphix support.