Skip to main content
Delphix

Special Characters in Secure Lookup (KBA5285)

 

KBA

KBA# 5285

Applicable Delphix Versions

This document is relevant to all Masking Engine versions.

At a Glance 

Characters: The affected characters are BOM, TAB, a hyphen, and accents above some letters. 
Algorithms: The affected algorithms are: 
 Name                | Algorithm code   | Special chars
---------------------+------------------+-------------------
 ADDRESS LINE 2 SL   | AddrLine2Lookup  | BOM
 ADDRESS LINE SL     | AddrLookup       | BOM
 FULL_NM_SL          | FullNMLookup     | TABs
 USCITIES_SL         | USCitiesLookup   | Special chars
 US_COUNTIES_SL      | USCountiesLookup | BOM, special chars
Terms used: BOMByte Order Mark
TAB = tabulator

Background

When these algorithms were created, some of the values in the source file had special characters in them (some are legitimate and part of the name and others were created by tools or systems used). These characters have been with these algorithms ever since. 

Since algorithms cannot be modified, as this will result in a different masked result, these characters will remain in these algorithms. 

Most of the time, these characters do not cause any issues. 

This KBA documents the algorithms, the values, and give some examples of errors they can cause. 

Special Characters

There are some masked values that have characters outside the old ASCII standard. This can be an issue if the database's or file's characters restrict the use of these characters.

Could cause error

The characters in these algorithms could cause the following error - the example below is from Sybase. 

Error converting characters into server's character set. Some character(s) could not be converted.

Algorithms and values   

Below is a list of algorithms and values: 

USCITIES_SL
"Winston�Salem"

US_COUNTIES_SL
"Coös County"
"Doña Ana County"
"Prince of Wales – Hyder Census Area"
"Valdez–Cordova Census Area"
"Yukon–Koyukuk Census Area"

Byte Order Mark (BOM) 

The BOM character is hidden characters at the start of a file to indicate the encoding type of the file. The BOM character will appear in the first value of some algorithms. It was caused by a BOM in the original imported lookup file.

Could case error 

Due to the character, when trimmed it will result in an extra character and if the column length is too short this can result in the value being too long for the column.

Example from Oracle: 

ORA-12899: value too large for column

Algorithms and values   

Below is a list of algorithms and the column lengths causing an error: 

ADDRESS LINE SL
 - Possible error if column length shorter or equeal to 21 characters. 

ADDRESS LINE 2 SL
 - Possible error if column length shorter or equeal to 9 characters.

US_COUNTIES_SL
 - Possible error if column length shorter or equeal to 19 characters.

TAB character 

One algorithm has a TAB character between some of the values. This does not usually cause an issue, though will cause an issue if a tab-delimited file is masked, and this algorithm is used. 

Could case error 

If this algorithm is used on a tab-delimited file, the column alignment will be out of order.

Algorithm

The below algorithm has a tab between some values:

FULL_NM_SL

Resolution 

To resolve this issue create a new custom algorithm.

If the original lookup values are required, please create a case with Delphix support.