Best Practice File Masking - Rule Set and Job Configuration (KBA1821)
KBA#1821
This article details the best practices for File Masking using FTP and SFTP.
Applicable Delphix Versions
Major Release | All Sub Releases |
6.0 | 6.0.0.0, 6.0.1.0, 6.0.1.1, 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0, 6.0.6.0, 6.0.6.1, 6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1 |
5.3 | 5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.3.1, 5.3.4.0, 5.3.5.0, 5.3.6.0, 5.3.7.0, 5.3.7.1, 5.3.8.0, 5.3.8.1, 5.3.9.0 |
5.2 | 5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1 |
5.1 | 5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0 |
5.0 | 5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1, 5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4 |
File Masking and Tokenization Recommendations
File Masking and Database Masking process the data in a masking job differently. File Masking jobs behave more like On-The-Fly (OTF) masking, and a few key configuration choices improve them.
This article is also applicable to Tokenization jobs. For simplicity, the term Masking will be used from now on.
Creation Process
The best practice for a File Masking job is On-The-Fly (OTF) masking. Because On-The-Fly can be tricky to configure directly, the recommended approach is to create the job as In-Place (IP) first and then switch it to On-The-Fly. The In-Place job sees all the files, and the Rule Set is created where the masked files are, or will be, located. The last step is to point the job at the source folder, at which point the On-The-Fly job is ready.
- Copy the files to the target (FTP/SFTP) folder.
- Create Rule Set and In-Place masking job.
- Once the In-Place job works, create a Source Environment and a Source Connector.
- Change the masking job to On-The-Fly.
This process also makes it easier to run Profiling jobs, which are then run against the Target.
Masking Process
File Masking jobs read all data (all rows and fields). This means that all data will be transferred over the wire.
Files and Patterns
Each file and pattern in the Rule Set has its own Transformation and a connector for Input and one for Output. This means that all files in a Pattern will be processed as a single object (the Output handles the creation of each file).
The following can be configured in a Rule Set:
- List of individual files
- List of RegEx patterns to mask multiple files
Example
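As a minimal sketch of how a RegEx pattern in a Rule Set can select several files at once, the Python snippet below groups file names by the first pattern they match. The file names, the patterns, and the `group_by_pattern` helper are all hypothetical, and full-match semantics are an assumption; the engine's own matching behaviour may differ.

```python
import re

# Hypothetical Rule Set patterns (illustrative only, not taken from an engine).
PATTERNS = [r"customers_\d{4}\.csv", r"orders_.*\.txt"]


def group_by_pattern(file_names, patterns=PATTERNS):
    """Map each pattern to the file names it fully matches.

    Returns (groups, unmatched), where unmatched lists the files
    that no pattern selected.
    """
    compiled = [(p, re.compile(p)) for p in patterns]
    groups = {p: [] for p in patterns}
    unmatched = []
    for name in file_names:
        for pattern, regex in compiled:
            if regex.fullmatch(name):
                groups[pattern].append(name)
                break  # first matching pattern wins (an assumption)
        else:
            unmatched.append(name)
    return groups, unmatched


files = ["customers_2021.csv", "customers_2022.csv", "orders_eu.txt", "readme.md"]
groups, unmatched = group_by_pattern(files)
print(groups)
print(unmatched)
```

Checking a pattern against a directory listing in this way, before adding it to the Rule Set, can show whether it selects more or fewer files than intended.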
Masking Job Configuration (On-The-Fly)
The two file masking methods, In-Place and On-The-Fly, mask data in the same way. The difference is that In-Place reads and writes files in the same location and overwrites the original file, while On-The-Fly reads from one location and writes to another.
To overwrite the original file, In-Place masking reads and writes each file twice over the network (this is due to User and Group properties). The file is first read, masked, and written to a temp file (*.msk). The temp file is then read and written a second time to overwrite the original file. This almost doubles the time it takes to mask a file compared to On-The-Fly, which reads and writes the data only once.
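The difference in network traffic described above can be estimated with a short calculation. The sketch below assumes that every read and every write moves the whole file over the network once; the function names and the 10 GiB file size are hypothetical.

```python
def network_passes(method):
    """Return (reads, writes) of the full file over the network."""
    if method == "in_place":
        # Read original, write *.msk temp file, read temp file, overwrite original.
        return 2, 2
    if method == "on_the_fly":
        # Read from the source, write to the target.
        return 1, 1
    raise ValueError(f"unknown method: {method}")


def bytes_transferred(file_size_bytes, method):
    """Estimate the total bytes moved over the network for one file."""
    reads, writes = network_passes(method)
    return file_size_bytes * (reads + writes)


size = 10 * 1024**3  # hypothetical 10 GiB file
ip = bytes_transferred(size, "in_place")
otf = bytes_transferred(size, "on_the_fly")
print(ip // 1024**3, "GiB in-place;", otf // 1024**3, "GiB on-the-fly")
```

The masking itself is performed only once in both cases, which is consistent with the runtime being almost, rather than exactly, doubled.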
Best Practice
The best practice is to use On-The-Fly.
The reasons are:
- On-The-Fly is faster.
- An OTF masking job takes almost half the time of the equivalent IP job.
- On-The-Fly is much more secure.
- Separate Masking Environments and (S)FTP folders for the unmasked Source and the masked Target keep unmasked and masked files apart.
- On-The-Fly works with more (S)FTP servers, including older versions.
How it Works: On-The-Fly
The one rule to remember when configuring On-The-Fly masking is that the Source is never, ever masked. Therefore, the masking (Masking Job, Rule Set, Algorithms, etc.) is always defined against the Target.
Note that the Target Environment is configured almost exactly like the In-Place Environment; only the Source Connector is different.
Pro-Tip: It is recommended to name the Environment and the Connector so that they can be identified as being the Source. The Source Environment should only have Connectors.
For On-The-Fly, this is what is needed:
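Requirements | Source | Target |
Environment | Yes | Yes |
Connector | Yes | Yes |
Rule Set | No | Yes |
Inventory (Masked Columns) | No | Yes |
Masking Job | No | Yes |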
Steps
Step 1 - Create File Format
In this step, create the File Format, unless it already exists.
- Go to Settings and File Formats.
- Click Import File Format.
Step 2 - Create Target Environment with Connector and Rule Set
The steps for the Target are as follows (the best procedure is to start with the Target):
- Copy the files to be masked to the target folder.
- Create an Environment.
- View environment.
- Create a Connector to the Target.
- Create a Rule Set.
- Open the Inventory and define the Masked Columns; alternatively, use Profiling.
Step 3 - Create Masking Job (for now as IP) and Test
This step only creates the job and tests it.
- Click on Overview.
- Create the Masking Job (use In-Place).
- Test and make sure files are masked as desired.
Step 4 - Create Source Environment with Connector
The steps for the Source are:
- Create a Source Environment.
- View environment.
- Create a Source Connector to the Source.
Step 5 - Last thing - Change Job to On-The-Fly
Go back to the Target Environment:
- Click on Overview.
- Change the Masking Job to On-The-Fly and set the Source Environment and Connector.
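The effect of Steps 3 to 5 can be pictured with a small data model: the job keeps its Rule Set and Target Connector, and the switch to On-The-Fly only adds a Source Connector. This is an illustrative sketch, not the engine's API; the `Connector` and `MaskingJob` classes, their field names, and the example names are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Connector:
    name: str
    environment: str


@dataclass(frozen=True)
class MaskingJob:
    name: str
    rule_set: str                # always defined against the Target
    target_connector: Connector
    on_the_fly: bool = False     # False means In-Place
    source_connector: Optional[Connector] = None


def switch_to_on_the_fly(job, source_connector):
    """Return a copy of the job that reads from the source connector."""
    if source_connector == job.target_connector:
        raise ValueError("source and target must be different connectors")
    return replace(job, on_the_fly=True, source_connector=source_connector)


target = Connector("sftp_target", "Files_Target")
source = Connector("sftp_source", "Files_Source")

ip_job = MaskingJob("mask_files", "files_rule_set", target)  # Step 3: In-Place
otf_job = switch_to_on_the_fly(ip_job, source)               # Step 5: On-The-Fly
print(otf_job)
```

Nothing about the Rule Set or the Target changes in the switch, which is why the job can be tested as In-Place first.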
Windows SFTP Server - OpenSSH
Windows now supports SFTP. The OpenSSH client and server are available as a supported Feature-on-Demand in Windows Server 2019 and Windows 10.
Best Practice
The best practice on Windows is to use the built-in OpenSSH Server.
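Before creating an SFTP Connector, it can help to confirm that the OpenSSH Server is reachable. The sketch below uses only the Python standard library to read the identification line that an SSH server sends when a client connects. The `read_ssh_banner` helper is hypothetical, and port 22 is assumed as the default.

```python
import socket


def read_ssh_banner(host, port=22, timeout=5.0):
    """Return the server's SSH identification line, or None if none is seen."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rb") as stream:
            # A server may send a few other lines before its "SSH-" line.
            for _ in range(20):
                line = stream.readline(256)
                if not line:
                    return None  # connection closed without a banner
                text = line.decode("ascii", errors="replace").strip()
                if text.startswith("SSH-"):
                    return text
    return None
```

Calling `read_ssh_banner` with the host name of the Windows server should return a line beginning with `SSH-2.0-` when the service is running. A connection error or `None` suggests the service or a firewall rule needs attention. This check does not authenticate or test the SFTP subsystem itself.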