Secure Lookup (SL) is a very commonly used algorithm. When used correctly, the algorithm is fast and lightweight. Out of the box, Delphix Data Platform has approximately 20 predefined secure lookup algorithms.
This document describes how Secure Lookup works and how to create a new Secure Lookup algorithm. Secure Lookup is a simple algorithm that hashes the input and uses the hash to select a value from a list of lookup values. These values are imported when the lookup algorithm is created. This method ensures that a given input always returns the same lookup value and that the supply of lookup values never runs out.
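As a rough illustration of the idea, the sketch below hashes the input and uses the hash to index into a pool of lookup values. The hash function (SHA-256), the `key` parameter, and the name `secure_lookup` are assumptions made for this example; this article does not state which hash the Masking Engine actually uses.

```python
import hashlib

def secure_lookup(value, lookup_values, key=b"example-key"):
    """Map an input string to a lookup value via a keyed hash (illustrative only)."""
    # Hypothetical choice of hash; the real engine may use something different.
    digest = hashlib.sha256(key + value.encode("utf-8")).digest()
    # Reducing the hash modulo the pool size means the pool can never "run out".
    index = int.from_bytes(digest[:8], "big") % len(lookup_values)
    return lookup_values[index]

pool = ["Flo", "Yoshiko", "Paul", "Christopher"]
first = secure_lookup("Peter", pool)
second = secure_lookup("Peter", pool)  # same input, same masked value
```

Because the index depends only on the input and the key, repeated runs give the same masked value, and any number of distinct inputs can be served by a fixed pool.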
Applicable Delphix Versions
- Major Release 6.0, all sub-releases
At a Glance
1 Unique Lookup - The loaded lookup values can contain duplicates in SL. If a value needs to be represented more frequently, add more records with that value.
3 Due to the randomness in the mapping allocation, there is no way to guarantee that the same masked value will not appear for different inputs, even if the number of lookup values exceeds the number of unique values masked. To get a 1:1 mapping, use the Mapping Algorithm.
|Output:||The masked output value will be cut to the length of the column (as defined by the data type).|
Works with all character encodings. Encoding should give the same result. Data is trimmed.
|Lookup Pool Size:||Recommended size up to 500,000|
|Limitations:||Large pool size (rows in the lookup file) will take a long time to load into the Transformation Engine. Please see below for more information.|
Creation: Lookup values are uploaded from a text file to the Masking Engine when the algorithm is created.
Modify: The Description and the set of Lookup Values can be changed.
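The note on 1:N mapping can be made concrete with a small calculation. The sketch below assumes each input is mapped to a lookup value uniformly at random (an idealization of a hash-based mapping, not a statement about the engine's internals) and computes the expected number of distinct masked values. The function name and the example sizes are hypothetical.

```python
def expected_distinct_outputs(pool_size, unique_inputs):
    """Expected number of distinct lookup values hit by uniformly random mapping."""
    # Probability that a given lookup value is never chosen: (1 - 1/N) ** M
    return pool_size * (1 - (1 - 1 / pool_size) ** unique_inputs)

pool_size = 5000       # lookup values loaded (hypothetical)
unique_inputs = 1000   # distinct source values to mask (hypothetical)

expected = expected_distinct_outputs(pool_size, unique_inputs)
expected_collisions = unique_inputs - expected
print(f"expected distinct masked values: {expected:.0f}")
print(f"expected inputs sharing a masked value: {expected_collisions:.0f}")
```

Even with five times more lookup values than inputs, the expected number of distinct outputs stays below the number of inputs, which is why a 1:1 mapping requires the Mapping Algorithm.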
Creating and Modifying Algorithms using the User Interface
The algorithms are accessed from the Settings tab > Algorithm. A custom algorithm can be created or modified.
- To create click Add Algorithm.
- To modify click the edit icon in the Edit column.
The following popup is displayed when creating and modifying the algorithm.
- When creating an algorithm, lookup values are loaded from a text file you supply when the algorithm is created. You need to provide details for Algorithm Properties:
- Algorithm Name
- When modifying an algorithm, the Lookup details (description and lookup value) can be modified:
- Lookup File Name
There are a few considerations:
- Pool Size and Load Time
- Memory Requirements
- Case Sensitivity
- Masked Data Encoded with Non-Standard Code Page
Pool Size and Load Time
The pool size (number of rows in the lookup file) affects both the creation time and the load time. As a rough estimate, load time grows in a 3 : 10 ratio: increasing the number of values to load by 3 times makes the load take 10 times longer. On an average system, it takes about 4 min to load 500,000 values. Loading 1,500,000 values takes about 40 min.
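The 3 : 10 rule of thumb can be turned into a quick estimator. The sketch below assumes the ratio holds as a power law across pool sizes, which is an extrapolation from the two data points given here (4 min for 500,000 values, 40 min for 1,500,000) rather than a documented formula; the function name is hypothetical.

```python
import math

def estimate_load_minutes(rows, base_rows=500_000, base_minutes=4.0):
    """Estimate load time assuming 3x the rows costs 10x the time."""
    # 3x rows -> 10x time implies time grows as rows ** (log 10 / log 3), about rows ** 2.1
    exponent = math.log(10) / math.log(3)
    return base_minutes * (rows / base_rows) ** exponent

for rows in (250_000, 500_000, 1_500_000):
    print(f"{rows:>9,} rows: about {estimate_load_minutes(rows):.1f} min")
```

Treat the result as an order-of-magnitude guide only; actual load times depend on the system and on the length of the lookup values.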
The graph below shows the load time (duration) in hh:mm:ss for a Secure Lookup with x number of rows.
Memory Requirements
The ingested values are loaded into RAM when the masking job starts. For large pool sizes, it might be required to increase the Min/Max Memory settings in the Job Configuration. The size required depends on the lengths of the data in the lookup and the number of lookup values.
Case Sensitivity
The Secure Lookup is case sensitive. This can cause an issue if the same result is expected, independent of the case. There is a workaround below or you can ask for our Technical Services group to create a custom algorithm that masks the database and retains the case. We are currently investigating future enhancements around case sensitivity.
To work around this issue:
- Open and edit the Rule Set.
- Add a Custom SQL statement with the following amendment:
Change: ..., maskMeCol, ...
To: ..., UPPER(maskMeCol) as maskMeCol, ...
Example: SELECT ID, UPPER(maskMeCol) as maskMeCol from myTable;
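The effect of wrapping the column in UPPER() can be sketched in Python. The `lookup` helper below is a hypothetical stand-in for a hash-based lookup (SHA-256 is an assumption, not the engine's documented hash); the point is only that normalizing the case before the lookup makes all case variants produce the same masked value.

```python
import hashlib

def lookup(value, pool):
    """Hypothetical hash-based lookup; case sensitive because the hash is."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]

pool = ["Flo", "Yoshiko", "Paul", "Christopher"]
variants = ["Peter", "PETER", "peter"]

# Without normalization each variant is hashed separately and may map differently.
raw = [lookup(v, pool) for v in variants]

# Upper-casing first (the analogue of UPPER(maskMeCol)) gives one shared result.
normalized = {lookup(v.upper(), pool) for v in variants}
```

Note that this approach feeds upper-cased data to the algorithm, so the original case of the input plays no part in the result.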
Masked Data Encoded with Non-Standard Code Page
The data to be masked might be encoded with a code page whose characters are not converted correctly to UTF-8. Such data is masked as "????". This issue is resolved in version 5.1.
The algorithms which have characters outside UTF-8 are ADDRESS LINE SL, ADDRESS LINE2 SL, and US_COUNTIES_SL.
If you experience this issue, please create a case with Delphix Support.
Use cases outside the feature scope
There are two use cases frequently requested:
- Retain case
- Mask multiple column values (Full Name, First Name, and Last Name)
These use cases are not covered by the out-of-the-box Secure Lookup algorithms. Technical Services can assist in creating a custom algorithm that masks the data based on specific requirements.
We are also currently investigating future enhancements and new algorithms.
The example below shows a masked result.
FIRST NAME SL was used in this example on a column defined with varchar(8).
- Note 1: NULL and white space strings.
- Note 2: case sensitivity.
- Note 3: spaces are trimmed.
- Note 4: 1:N mapping - the masked values 'collide'.
- Note 5: strings are cut (cropped) to fit the column (database masking only).
+-----+-----------+----------+
| Ref | Source    | Masked   |
+=====+===========+==========+
| 1   | NULL      | NULL     | << Note 1: NULL and white space strings (' used to indicate space)
| 2   | ''        | ''       | << "
| 3   | ' '       | ' '      | << "
| 4   | '  '      | '  '     | << "
| 5   | Peter     | Flo      |
| 6   | PETER     | Yoshiko  | << Note 2: Case sensitivity examples.
| 7   | peter     | Paul     | << "
| 8   | ' Peter ' | Flo      | << Note 3: Leading and/or trailing whitespaces are trimmed.
| 9   | Bar       | Yoshiko  | << Note 4: 1:N mapping - no unique mapping.
| 10  | Foo       | Christop | << Note 5: Masked value ('Christopher') cropped.
+-----+-----------+----------+
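The behaviors in Notes 1, 3, and 5 can be summarized in a short sketch. It assumes a hash-based lookup using SHA-256 and a hypothetical function `mask_value`; neither is taken from the product, and the specific names returned will differ from the table above because the real hash is unknown.

```python
import hashlib

def mask_value(source, pool, column_length):
    """Illustrative masking of one value, following Notes 1, 3, and 5."""
    if source is None:
        return None                      # Note 1: NULL stays NULL
    trimmed = source.strip()             # Note 3: surrounding whitespace is trimmed
    if trimmed == "":
        return source                    # Note 1: empty and white space strings are unchanged
    digest = hashlib.sha256(trimmed.encode("utf-8")).digest()
    masked = pool[int.from_bytes(digest[:8], "big") % len(pool)]
    return masked[:column_length]        # Note 5: cropped to fit the column

pool = ["Flo", "Yoshiko", "Paul", "Christopher"]
same = mask_value(" Peter ", pool, 8) == mask_value("Peter", pool, 8)
cropped = mask_value("Foo", ["Christopher"], 8)
```

With a single-entry pool, the last call shows the cropping from Note 5: 'Christopher' is cut to the eight characters that fit a varchar(8) column.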