Skip to main content
Delphix

Using Custom SQL (KBA1764)

 

 

KBA

KBA#
1764

This article describes Custom SQL which is a Rule Set customization feature. You will learn how to use the feature, casting of values, pitfalls, and troubleshooting. There are some customizations that are related to the Custom SQL which are also highlighted, for example Logical Key.

Note

Note:

This is an advanced feature that requires a good knowledge of SQL statements, the structure of the data and the table to be masked, and how masking is performed.

It is recommended to first start creating a normal masking job and then if needed tweak the SQL in Custom SQL.

Applicable Delphix Versions 

Click here to view the versions of the Delphix engine to which this article applies
Date Release
Mar 20, 2024 | Apr 2, 2024 21.0.0.0 | 21.0.0.1
Feb 21, 2024 20.0.0.0
Jan 25, 2024 19.0.0.0
Dec 20, 2023 | Jan 10, 2024 18.0.0.0 | 18.0.0.1
Nov 21, 2023 17.0.0.0
Oct 18, 2023 16.0.0.0
Sep 21, 2023 15.0.0.0
Aug 24, 2023 14.0.0.0
Jul 24, 2023 13.0.0.0
Jun 21, 2023 12.0.0.0
May 25, 2023 11.0.0.0
Apr 13, 2023 10.0.0.0 | 10.0.0.1
Mar 13, 2023 | Mar 20, 2023 9.0.0.0 | 9.0.0.1
Feb 13, 2023 8.0.0.0
Jan 12, 2023 7.0.0.0
Releases Prior to 2023
Major Release All Sub Releases
6.0

6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1, 6.0.13.0, 6.0.13.1, 6.0.14.0, 6.0.15.0, 6.0.16.0, 6.0.17.0, 6.0.17.1, 6.0.17.2

At a Glance  

Description:

The Rule Set is the glue for the Masking Engine. The Rule Set provides you with:

  • Connector Information 
  • Inventory of Masking Rules
  • Masking and Profiling Job
  • Tables and Customization Rules 

Why Use Custom SQL? 

The Custom SQL is a great tool which enables you to change the data that goes into the Transformation Engine (or in other words 'what and how it is being masked'). With Custom SQL we can control and change:

  • The ID field(s) (or Logical Key).
  • The Masked Fields.
  • Casting values.
  • Reformatting of values.
  • Replacing Values.
  • Renaming Fields.
  • Filter with where clauses and in select statements.
  • Creating table joins.

User Interface 

Custom SQL is accessed from the Rule Set Menu.

To modify the Rule Set:

  1. Click on the Rule Set menu.
  2. Click Edit or the link on the Rule Set being edited.
    1_Edit Ruleset.png
  3. Click Next on the Details page.
  4. On Data Tables Page, select the Table you want to edit.
    2_DataTables.png
  5. Select Edit Custom SQL.
    3_editCustomSQL.png
  6. Click Save.

Custom SQL Indication 

When a Custom SQL has been specified for a Table, this is indicated in the user interface with the word "SQL" as shown above.

Editing Custom SQL  

When selecting Custom SQL in the popup, a SQL statement is presented.
To assist with all column names, the SQL contains all field names in the table and some cast definition if the data type needs to be converted in order to be masked (for example bit for boolean which is cast to int). 

 

tip

Tip:

Edit the SQL in MS Access or any other database or database design tool.

Create a connection to your database, paste in the SQL and edit and test it. When the data extracted is meeting your masking needs, paste it back into the Custom SQL box. 

Required Fields 

There are some essential fields which are required in order to get the masking job to work. This differs significantly between In-Place and On-The-Fly masking jobs.

In-Place 

With In-Place masking, only the Masked Columns and the Unique ID are needed to be extracted and ingested into the masking transformation process. 

In a normal masking job, the selection of fields is automatic and only required fields are included. 

select UniqueID, MaskedCol from MaskedTable

Masking - Custom SQL - In-Place.png

Required Columns
  1. Unique Identifier for the table.
    This should be either an
    • unmasked unique field in the table
    • unique identifier if added by the masking process. 
  2. Masked Columns with the correct column names.
  3. Any Other Column specified will be ingested but not used.

 

Note

Note:

 Remove Columns that are not used.

It is highly recommended to remove all columns that are not needed to mask the data as all columns added are loaded into memory and will use job memory.
This will also slightly slow down the masking job. There are many cases where large objects being part of the columns have generated Out of Memory errors.

On-The-Fly 

On-The-Fly masking is different as all data is extracted from one table and inserted into another. Therefore, all data to be inserted back is needed to be specified in the Custom SQL.

select MaskedCol, AllOtherCol from MaskedTable

Masking - Custom SQL - On-The-Fly ver 3.png

Required Columns
  1. Masked Columns with the correct column names. 
  2. All Other Columns that are needed in the new table (which is usually all columns). 

Since all columns are generally needed in On-The-Fly masking, this can mean that these Custom SQL can be quite complex. It might also require more memory to hold all columns in memory.

 

Note

Note:

 No ID column(s) are needed in On-The-Fly masking. This is because records are inserted instead of updated.

Unique ID for In-Place Masking 

The Unique ID is Primary Key or Index 

In this case, the Unique ID only needs to be specified in the Custom SQL statement. 

Specify the Unique ID column in the:

  • Custom SQL statement. 
There is a Unique ID but it is Not a Primary Key or Index 

In this case, the Masking Engine does not know which column is the index so you need to specify a Logical Key.

Specify the Unique ID column in the:

  • Logical Field in the table Rule Set settings. 
  • Custom SQL statement. 
If there is no Unique Identifier or if the Unique Identifier is masked 

If a Unique Key is missing or masked then the following Unique Key is needed in the Custom SQL statement:

  • Identify database type and determine which Row Id is used/created.
  • Specify the Unique ID column in the Custom SQL statement. 

Oracle - ROWID 

If the database is Oracle, use the ROWID key as identified. Its a good practice to use this all the time for Oracle.

select ROWID, MaskedCol from MaskedTable
MS SQL Server - DMS_ROW_ID 

On MS SQL Server, if there is no Unique Key a key will be temporarily added by the Masking Engine called 'DMS_ROW_ID'. If the Custom SQL is used this needs to be added in the Custom SQL statement even if the Masking Engine is adding it to the Table. 

select DMS_ROW_ID, MaskedCol from MaskedTable

Advanced Custom SQL Constructs  

 

Note

Note:

 Things to remember:

  • Provide a name for the column back to the masked column name
    For example add "AS MaskedCol".
  • Column and Tables names can be case sensitive, especially when they are quoted (with ” or ’). Always test the SQL statement first.
  • Functions and syntax differ between databases.
    Check for correct syntax for your database.
  • Test the SQL on a database design tool. 

Selecting Values and Formats 

Set the masked value or string to a constant 

With Custom SQL it is possible to set the masked value to a constant value. This can be very useful if the masked column is a BLOB or a long text that needs to be masked to a specific string. As there is no need to send all data into the Masking Transformation - we can change the BLOB or the TEXT to a short string or value. 

All values in the column to a string constant  
... 'string' AS MaskedCol ...

Example: 

select UniqueID,'Redacted text object' AS LeagalOutcome from MaskedTable
select UniqueID, 'xxxx-xxxx-xxxx-xxxx' AS CreditCard from MaskedTable
All values in the column to a number 
... value AS MaskedCol ...

Example:

select UniqueID, 1 AS Salary from MaskedTable

Set the masked value to a random value 

It is possible to set each row to a different value - for example a 'random' value.

MS SQL Server Example - generating a random value between 20 and 85:

select ID, ABS(CHECKSUM(NewId())) % 65 + 20 AS Age FROM MaskedTable

 

Note

Note:

 A better Algorithm is Secure Lookup
It might be better to add the values 20 to 85 in a text file and use Secure Lookup as this will handle NULL and spaces as well. The example above is used to highlight what is possible and it is possible to use this in conjunction with an algorithm. 

Cast and Convert the masked value or string 

Cast - number 
  • Cast value to a 4 digit decimal number. 
  • For example 9.5 to 9.5000.
... CAST(MaskedCol AS decimal(6,4)) AS MaskedCol ...
Convert - number 
  • Same as above but with CONVERT.
  • Converts a number to 4 digit decimal number.
... CONVERT(decimal(MaskedCol), 9.5) AS decimal AS MaskedCol ...
Parse a string to money 

This example uses parse to convert a string to a value of type money - for example '($1,000.00)'. 

... PARSE(MaskedCol AS MONEY) AS MaskedCol ...
Convert - character Set 

This is shown as an example. It will not resolve character conversion errors. It might though be able to handle some conversions prior to performing masking. 

... CONVERT(MaskedCol, 'US7ASCII', 'WE8ISO8859P1') AS MaskedCol ...

Case - IIF - Then - Else 

This is an useful building block in Custom SQL. The implementation varies slightly between different databases. One specific case is to filter out NULL values or other specific values before masking. 

Case  - When - Then - Else 

This can be nested and multiple WHEN clauses can be specified after each other. 

Syntax:

CASE
   WHEN when_expression THEN result_expression [ ...n ]         
   [ ELSE else_result_expression ]
END

Example:

select 
   UniqueID, 
   CASE WHEN MaskedCol is NULL THEN 1000 ELSE MaskedCol END as MaskedCol 
from 
   MaskedTable
IIF  

IIF is a shorter one-liner similar to the Excel function.

MS SQL Server -Syntax:

... IIF ( boolean_expression, true_value, false_value ) AS MaskedCol ...

Example:

select UniqueID, IIF(MaskedCol is NULL, '', MaskedCol) AS MaskedCol from MaskedTable  
COALESCE 

A much better way for these examples though is to use COALESCE. It returns the 'first non-null value in a list of expressions'.  

Syntax:

... COALESCE (expr1, expr2) ...

Example:

select UniqueID, COALESCE (MaskedCol, 'Change Null to this') AS MaskedCol from MaskedTable

String manipulation 

Change string to Upper/Lower 

This is a useful function as the masking engine is case sensitive and Secure Lookup will mask 'Peter' and 'PETER' to different values. 

... UPPER(MaskedCol) AS MaskedCol ...
... LOWER(MaskedCol) AS MaskedCol ...

Example:

select UniqueID, UPPER(FirstName) as FirstName from MaskedTable
Concatenate Strings and using Right, Left, SubString 

A specific use case for this is if the data in the column is not adhering to a pattern (for example values 0001 is stored as 1 in the database). It can also be that two values in the database need to be concatenated before masking. 

First the syntax - CONCAT:

... CONCAT ( string_value1, string_value2 [, string_valueN ] ) ...

Syntax - RIGHT, LEFT, SUBSTRING

... RIGHT ( character_expression , integer_expression ) ...
... LEFT ( character_expression , integer_expression ) ...
... SUBSTRING ( expression ,start , length )  ...

How it would be used with MaskedCol:

... CONCAT(MaskedCol, ' added string') AS MaskedCol ...

Examples:

select UniqueID, CONCAT(FirstName, " ", LastName) as FullName from MaskedTable
select 
   UniqueID, 
   RIGHT(CONCAT('0000', CAST( MaskedCol as VARCHAR(10))),4) as MaskedCol
from MaskedTable 
Trimming white spaces 

The last and very useful functions are LTRIM and RTRIM. These will remove any leading or trailing white spaces - which can cause different masking results. 

 

Note

Note:

 All columns of datatype CHAR have trailing white spaces.
A string stored in a column with CHAR and with VARCHAR is different. CHAR is of fixed length and will have trailing spaces while VARCHAR  is of variable length and will usually have no spaces after the string.

For example CHAR(10): "PETER     " and VARCHAR(10): "PETER" 

... LTRIM(MaskedCol) AS MaskedCol ... 
... RTRIM(MaskedCol) AS MaskedCol ...

Example:

These can be added together like in this example.

select ID, LTRIM(RTRIM( MaskedCol )) AS MaskedCol FROM MaskedTable

Where Predicates  

The Custom SQL can also include Where clauses in order to:

  • Select specific records, for example, Personal Accounts vs Company Accounts
  • Select data per Partition Table

Note that any column can be used in the where predicate.

Note

Note:

 The negative Predicate is unmasked.
A warning - the unmatched rows in the where predicate will not be masked. Multiple jobs are needed to mask a complete dataset. 

... WHERE AnyCol = expr;

Example: 

select UniqueId, MaskedCol WHERE AnyCol = 'PERSONAL'