Skip to main content
Delphix

Algorithm: Secure Lookup (SL) (KBA1243)

 

Secure Lookup (SL) is a very commonly used algorithm. When used correctly, the algorithm is fast and lightweight. Out of the box, Delphix Data Platform has approximately 20 predefined secure lookup algorithms. 

This document describes how Secure Lookup works and how to create a new one.  The Secure Lookup algorithm is a simple algorithm which uses a hashed code of the input to map a lookup value. These values are imported in the creation of the lookup algorithm. This method ensures that the lookup value returns the same result each time and that the number of lookup values never runs out.

 

Warning

Warning:

Since the values in the output is likely to repeat, this algorithm is not suited for indexes and when the values need to be unique.

 

Note

Note:

It is recommended to limit the number of items in the Secure Lookup to under 500,000.  

Applicable Delphix Versions

This article applies to the following versions of the Delphix Engine:

At a Glance

Versions: Applicable Delphix Masking versions: 4.x, 5.0.x, 5.1.x, 5.2.x, 5.3.x
Characteristics:
 
Type
Unique
Lookup1
Referential
Integrity2

1:1 Mapping3

Comment
SL General Lookup No Yes  No3  
NULL-SL Lookup No Yes No Masks everything to NULL.

Unique Lookup - The loaded lookup values can be duplicated in SL. If a value needs to be represented more frequent - add more records. 
Referential Integrity - The masked value will be the same between tables, databases, and jobs.
1:1 Mapping - The masked value will be mapped uniquely to the input value.

3 Due to the randomness in the mapping allocation, there is no way to guarantee that same masked value will not appear even if the number of lookup values exceeds the number of unique values masked. To have 1:1 mapping please use the Mapping Algorithm. 

Character 
Encodings:

Works with all character encodings. Encoding should give the same result. Data is trimmed. 
SL is Case Sensitive (Anders and ANDERS are different). See "Case Sensitivity" section below for the workaround.  

Lookup Pool Size: Recommended size up to 500,000
Limitations: Large pool size (rows in the lookup file) will take a long time to load into the Transformation 
Engine. Please see below for more information.
Customization:

Creation: Lookup values are loaded when the algorithm is created from a text file.

Modify: The Description and the Lookup Value cab be changed. 

Creating and Modifying Algorithms using the User Interface

The algorithms are accessed from the Settings tab >  Algorithm. A custom algorithm can be created or modified.

  • To create click Add Algorithm.
  • To modify click the edit icon the Edit column.

User Interface

The following popup is displayed when creating and modifying the algorithm.

image.png

  1. When creating an algorithm, lookup values are loaded when the algorithm is created from a text file. You need to provide details for Algorithm Properties:
    • Algorithm Name
    • Description
  2. When modifying an algorithm, the Lookup details (description and lookup value) can be modified:
    • Lookup File Name

 

Note

Note:

By default, editable parameters are Description and List of Lookup Values only. To modify any other parameter, the algorithm needs to be deleted and recreated.

Considerations

There are a few considerations:

  • Pool Size and Load Time 
  • Memory Requirements
  • Case Sensitivity
  • Masked Data Encoded with Non-Standard Code Page

More details below. 

Pool Size and Load Time 

The pool size (number of rows in the lookup file) affects both the creation time and the load time. An estimate is that the increase in load time is 10 to 3 - increase the number of values to load by 3 times and it will take 10 times longer. On an average system, it will take 4 min to load 500,000 values. Loading 1,500,00 will take 40 min. 

The graph below shows the load time (duration) in hh:mm:ss for a Secure Lookup with x number of rows.  

Memory Requirements  

The ingested values are loaded into RAM when the masking job starts. For large pool sizes, it might be required to increase the Min/Max Memory settings in the Job Configuration. The size required depends on the lengths of the data in the lookup and number of lookup values. 

Best Practice: Job Configuration on the Masking Engine

Case Sensitivity   

The Secure Lookup is case sensitive. This can cause an issue if the same result is expected, independent of the case.

Workaround:

To resolve this issue;

  1. Open and edit the Rule Set.
  2. Add a Custom SQL statement with the following amendment:

Change: ..., maskMeCol, ...

 to: ..., UPPER(maskMeCol) as maskMeCol, ...

Example:
SELECT ID, UPPER(maskMeCol) as maskMeCol from myTable;

Masked Data Encoded with Non-Standard Code Page  

The data to be masked might be encoded and contain characters that are not converted correctly to UTF-8. The masked data is masked as "????". This issue is resolved in version 5.1.

If experiencing this issue please create a case with Delphix support. 

Examples  

The example below displays an example for masked result. Note the case sensitivity. 

FIRST NAME SL was used in this example.

+-----+--------+----------+
| Ref | Source | Masked   |
+=====+========+==========+
| 1   | Peter  | Flo      |
+-----+--------+----------+
| 2   | PETER  | Yoshiko  |  << Case insensitivity examples.
+-----+--------+----------+
| 3   | peter  | Paul     |  <<   "
+-----+--------+----------+
| 4   |  Peter | Flo      |  << Leading and/or trailing whitespaces are trimmed.
+-----+--------+----------+     Result same as Ref 1.
...
| 5   | Bar    | Yoshiko  |  << 1:N Mapping - no unique mapping.
+-----+--------+----------+

 

Related Articles

The following articles may provide more information or related information to this article: