Description

The DelimitedFileMaskReceiver masks targeted values within a given delimited file using Synthetic Data Masking (SDM).


Notes: 

  • This Receiver only works with flat delimited files.
  • It replaces the targeted values from the source file with synthetic data and writes the result to a new delimited file.
  • If the Domain loopCount does not match the number of records in the source file, the data will not be masked correctly.


When should this Receiver be used?

  • Any time values within a flat delimited file must be masked with synthetic data values.


When should this Receiver not be used?

  • Any time values must be masked in other file types (e.g., JSON, ORC).


How to Mask a Delimited File 

Note: A use case is provided later in this article.

  1. Select a Project and Project Version.
  2. Create a Domain and add Attributes for the values that will be masked.
  3. Remove the id Attribute from the Domain if it is not being used.
  4. Change the Domain's loopCount to match the number of records in the source file. 
  5. (Optional) Modify assigned Generators.
  6. Assign the DelimitedFileMaskReceiver to the Domain.
  7. Configure the Receiver's Parameters (source file Location, etc.).
  8. Configure the Receiver Attribute Property Keys (columnIndex and include).
  9. Create a Scenario for the Domain and download it.
  10. Run the genRocket -r command at the command line to generate a masked delimited file.
     

Receiver Parameters

The following parameters may be configured for the DelimitedFileMaskReceiver. Items with an asterisk* are required. 

  • sourceFileName* - Defines the name of the file to mask. 
  • sourcePath* - Defines the base path where the source file to be masked is located. 
  • sourceSubDirectory - Defines an optional subdirectory under sourcePath where the source file is located. 
  • destPath* - Defines the base path where the clean masked file will be stored.
  • destSubDirectory* - Defines an optional subdirectory under destPath where the clean masked file will be stored. 
  • delimiter* - Defines the delimiter. 
  • hasHeader* - Should be 'true' when the source file has a header and 'false' when it does not. The default value is 'true'.
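
The relationship between the path parameters can be pictured with a short Python sketch. This is only an illustration, not the Receiver's implementation: the function resolve_locations, the assumption that a subdirectory is simply appended to its base path, and the example paths under /home/tester are all hypothetical.

```python
from pathlib import PurePosixPath


def resolve_locations(params):
    """Combine parameter values into a source file path and a destination directory."""
    # Assumption: a subdirectory is appended to its base path, and an empty
    # or missing subdirectory leaves the base path unchanged.
    source_file = PurePosixPath(
        params["sourcePath"],
        params.get("sourceSubDirectory", ""),
        params["sourceFileName"],
    )
    dest_dir = PurePosixPath(params["destPath"], params.get("destSubDirectory", ""))
    # hasHeader is a 'true'/'false' string whose default is 'true'.
    has_header = params.get("hasHeader", "true") == "true"
    return source_file, dest_dir, has_header


# Hypothetical values for illustration only.
example = {
    "sourceFileName": "SampleDelimitedFile.txt",
    "sourcePath": "/home/tester/data",
    "destPath": "/home/tester/output",
    "destSubDirectory": "maskedOutput",
    "delimiter": "\t",
}
source_file, dest_dir, has_header = resolve_locations(example)
print(source_file)
print(dest_dir)
print(has_header)
```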


Receiver Attribute Property Keys

The Receiver defines two property keys that can be modified on any of its associated Domain Attributes:

  • columnIndex - Specifies the zero-based index of the column as it appears in the file header. It must be a number (e.g., 0, 1, 2, 3, ...). The first column has an index of 0, and the index increases by 1 for each additional column within the source file.

  • include - Defines whether the Domain Attribute will be used for masking. If 'true', the values will be masked for that Domain Attribute.
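
The effect of the two property keys can be sketched in Python. This is a conceptual illustration under stated assumptions, not how the Receiver is implemented: mask_rows is a hypothetical helper, and the lambda functions are simple stand-ins for the Generators assigned to each Attribute.

```python
def mask_rows(rows, attributes, generators):
    """Yield a copy of each row with the included columns replaced by synthetic values."""
    for record_number, row in enumerate(rows, start=1):
        masked = list(row)
        for name, props in attributes.items():
            # Only Attributes with include set to True are masked.
            if props["include"]:
                masked[props["columnIndex"]] = generators[name](record_number)
        yield masked


# columnIndex is zero-based: lastName is the third column, so its index is 2.
attributes = {
    "lastName": {"columnIndex": 2, "include": True},
    "ssn": {"columnIndex": 4, "include": True},
    "phoneNumber": {"columnIndex": 5, "include": True},
}

# Hypothetical stand-ins for real Generators.
generators = {
    "lastName": lambda n: f"LastName{n}",
    "ssn": lambda n: f"000-00-{n:04d}",
    "phoneNumber": lambda n: f"555-01{n:02d}",
}

rows = [["Ada", "B", "Lovelace", "36", "123-45-6789", "555-867-5309"]]
masked = list(mask_rows(rows, attributes, generators))
print(masked[0])
```

Columns 0, 1, and 3 pass through unchanged because no included Attribute points at them.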


Receiver Attribute Property Keys Example

Three values will be masked within the source file: lastName, ssn, and phoneNumber (shown below).

 

In this example, the Domain contains all Attributes, but only three will actually be masked.

  • columnIndex - 2, 4, and 5 for the columns that will be masked.
  • include - lastName, ssn, and phoneNumber are 'true' and will be masked.


Use Case 1 - Mask Last Name, SSN, and Phone Number

A tester wants to mask the Last Name, SSN, and Phone Number values within a delimited file with synthetic data. The source file is shown below: 


 

Step 1 - Create a Domain with Attributes

Create a Domain and add an Attribute for each set of values that must be masked. 



Step 2 - Remove the id Attribute from the Domain

An id Attribute is created automatically when the Domain is created. Remove this Attribute from the Domain if it will not be used, using the Trash Can icon.



Step 3 - Change the Domain loopCount

The loopCount has been set to '15' because the source file contains fifteen records.
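
To find the value to enter for loopCount, the records in the source file can be counted ahead of time. The Python sketch below is one way to do that; count_records is a hypothetical helper, and it assumes that the header row (when present) is not counted as a record and that blank lines are ignored.

```python
import csv
import io


def count_records(handle, delimiter="\t", has_header=True):
    """Return the number of data records in an open delimited file."""
    # Blank lines parse to empty rows and are skipped.
    rows = [row for row in csv.reader(handle, delimiter=delimiter) if row]
    # Assumption: the header row is not a record.
    if has_header and rows:
        rows = rows[1:]
    return len(rows)


# In-memory sample; for a real file, pass open(path, newline="") instead.
sample = io.StringIO("firstName\tlastName\nAda\tLovelace\nAlan\tTuring\n")
print(count_records(sample))
```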


Step 4 - Add the DelimitedFileMaskReceiver and Configure Receiver Parameters

The DelimitedFileMaskReceiver has been added to the Mask Domain.

 

The following Receiver parameters have been configured as shown below: 

  • sourceFileName = SampleDelimitedFile.txt
  • destSubDirectory = maskedOutput
    • Note: The generated masked file will be placed in a subdirectory titled 'maskedOutput' located within the output directory.
  • delimiter = tab ( \t )


Step 5 - Configure Receiver Attribute Property Keys

  • columnIndex - The tester has entered 2, 4, and 5 for the lastName, ssn, and phoneNumber Attributes. Indexes 0, 1, and 3 are skipped because they belong to the following columns, which will not be masked in the output file:
    • firstName - index 0
    • middleInitial - index 1
    • age - index 3
  • include - Set to 'true' for the lastName, ssn, and phoneNumber Attributes (when 'true', the value will be masked).
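
The skipped indexes can be double-checked with a small Python sketch. It is an illustration only: unmasked_columns is a hypothetical helper, and the header list is assumed from the column names used in this use case.

```python
def unmasked_columns(header, attributes):
    """Return (index, name) pairs for the columns that no included Attribute targets."""
    masked = {
        props["columnIndex"] for props in attributes.values() if props["include"]
    }
    return [(index, name) for index, name in enumerate(header) if index not in masked]


# Column order assumed from the use case.
header = ["firstName", "middleInitial", "lastName", "age", "ssn", "phoneNumber"]
attributes = {
    "lastName": {"columnIndex": 2, "include": True},
    "ssn": {"columnIndex": 4, "include": True},
    "phoneNumber": {"columnIndex": 5, "include": True},
}
left_as_is = unmasked_columns(header, attributes)
print(left_as_is)
```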

     

Step 6 - Create and Download a Scenario

A MaskScenario was created for the Mask Domain and can be downloaded using the Cloud icon. 



Step 7 - Run the genRocket -r Command to Generate the Masked File


Source File


Masked Output File
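
After the masked file is generated, it can be compared with the source file to confirm that only the targeted columns changed. The Python sketch below shows one way to do that under stated assumptions: find_masking_problems is a hypothetical helper, both files are assumed to have already been read into lists of rows, and masked columns are not required to differ because a synthetic value may coincide with the original.

```python
def find_masking_problems(source_rows, masked_rows, masked_indexes):
    """Return a list of messages describing unexpected differences."""
    problems = []
    if len(source_rows) != len(masked_rows):
        problems.append(
            f"row count differs: {len(source_rows)} source, {len(masked_rows)} masked"
        )
    for number, (before, after) in enumerate(zip(source_rows, masked_rows), start=1):
        for index, (old, new) in enumerate(zip(before, after)):
            # Columns outside masked_indexes should be copied unchanged.
            if index not in masked_indexes and old != new:
                problems.append(f"row {number}, column {index}: unmasked value changed")
    return problems


source_rows = [["Ada", "B", "Lovelace", "36", "123-45-6789", "555-867-5309"]]
masked_rows = [["Ada", "B", "LastName1", "36", "000-00-0001", "555-0101"]]
problems = find_masking_problems(source_rows, masked_rows, {2, 4, 5})
print(problems)
```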