Description

This article provides a basic use case example of performing Synthetic Data Masking (SDM) when migrating data between two identical MySQL databases using the G-Migration+ feature. 

GenRocket uses Synthetic Data Masking or SDM to mask sensitive data values when migrating data with the G-Migration+ feature. With SDM, synthetically generated data is inserted into the destination database for table columns selected as sensitive columns. 


In This Article


User Story

A power user (e.g., Test Data Engineer/TDE) wants to mask sensitive data in three related tables in a MySQL database before inserting the data into the destination database.


Actors

  • Source MySQL Database (source_db)
  • Target MySQL Database (target_db)
  • G-Migration+ Feature
  • XTS File
  • Config.properties Files


Source Database Tables/Records

  • Customer - 1000 Customer Records
  • Address - 1 Address Record per Customer Record (1000 total)
  • Account - 1 Account Record per Customer Record (1000 total)


Synthetic Data Masking (SDM) Requirements

  • Customer Table Columns
    • last_name
    • dob
    • ssn
    • phone_number

  • Account Table Columns
    • account_number


Entity Relationship (ER) Diagram


User Steps

The user must complete the following steps to mask the required data and migrate it to a destination database.


Step 1 - User Creates Source and Destination Database Property Files

Configuration files must be present to connect to the source and destination databases via JDBC. Two properties files must be present on the user's local computer and contain the appropriate database connection information. For example, they can be called source.properties and destination.properties


These files can be placed within a subdirectory and must be present to connect to each database during the data migration. For more information on how to create these configuration files, please click here.


For this example, the following files will be used and placed in a Config subdirectory within the path specified for the user's resource.JDBC directory resource in the GenRocket web platform.


source_db.properties

driver=com.mysql.jdbc.Driver
user=root
password=root
url=jdbc:mysql://localhost:3306/source_db?rewriteBatchedStatements=true
batchCount=1000


target_db.properties

driver=com.mysql.jdbc.Driver
user=root
password=root
url=jdbc:mysql://localhost:3306/target_db?rewriteBatchedStatements=true
batchCount=1000


Note: Please note that 'root' has been entered for the user name and password information. The user will need to use each database's actual user and password information.

Step 2 - User Creates XTS File for Tables in Source Database

A table schema must be created using the genrocket -xts command. For this example, it will be created only for the customer, account, and address tables. 


Sample Command (Windows Machine)

genrocket -xts C:\Users\username\config\source_db.properties customer,account,address


Sample Command (Mac OSx/Linux)

genrocket -xts /home/users/username/config/source_db.properties customer,account,address

Created File


Note: Users can specify a schema, tables, etc., as part of the genrocket -xts command. Click here to learn more.


Step 3 - User Creates a Project with Default Project Version

A project with the default project version must be created and available to mask sensitive data values with synthetic data values prior to database insertion.
 


Note: Click here to learn more about creating Projects in the GenRocket web platform. 


Step 4 - User Imports Table Schema using created XTS File

  • The user selects G-Migration+ in Self Service Menu for Project Version.
  • The user clicks the Import Table Schema button and selects the XTS file. 
  • The XTS is imported, and the user can view the Table Schema.


    Note: Click 
    here to see the complete table schema import steps. 


Step 5 - User Creates a G-Migration+ Configuration

  • The user clicks the Create G-Migration+ Config button.



  • The user enters/selects the following information: 
    • Name = MaskingOnlyDemo
    • Description = Mask Sensitive Data Values
    • DataDumpSubDir = dump
    • Partition Receiver = MySQLPartitionReceiver
    • Source Database
      • Name = source_db
      • JDBCSubDirectory = Config
      • JDBCConfigName = source_db.properties
    • Destination Database
      • Name = target_db
      • JDBCSubDirectory = Config
      • JDBCConfigName = target_db.properties
  • The user clicks Save.


Step 6 - User Adds Customer and Account Tables

Next, the user adds the Customer and Account Tables to the G-Migration+ configuration.

  • The user clicks the Manage Tables button.


  • The user selects the Customer and Account Tables.
  • The user clicks Save.


    Note: Click here for step-by-step instructions on how to add tables and select sensitive columns for SDM.

Step 7 - User Selects Sensitive Columns in the Customer Table

  • The user clicks the checkboxes to select sensitive data columns (i.e., dob, ssn, last_name, phone_number).

  • The user clicks Save and Next.


Step 8 - User Selects Sensitive Columns in the Account Table

  • The user selects the account_name column within the Account Table.

  • The user clicks Save.





Step 9 - User Downloads G-Migration+ Configuration and Scenarios 

  • The user downloads the G-Migration+ Configuration and required Scenarios (Default Scenario, Customer Scenario, Account Scenario).



  • The files will be downloaded as a single zip file. Users can extract the files and move them to the required locations.




  • The user places the Default Scenario, Customer Scenario, and Account Scenario in the path defined for the resource.output.directory.



  • Please note that this is the default path that is automatically configured for the chosen Partition Receiver (e.g., MySQLPartitionReceiver) and is where it will look for the Default Scenario.


Step 10 - User Runs Command at Command Line

The user enters the following command using the name of their G-Migration+ configuration: 

Generic


 

genrocket -gmigp <G-Migration+ Name>.gredb

Example

genrocket -gmigp SubsettingOnlyDemo.gredb


Synthetic Data Masking (SDM) Results

For this use case, all customer, account, and address records will be migrated to the destination database (1,000 records for each table). Data values in sensitive columns will be masked with Synthetic Data before database insertion. 


source_db Customer Table


target_db Customer Table

The last name, dob, ssn, and phone number of each customer have been masked with random synthetic data: 



Note: This is a simple example. Users can modify Generator parameters before migrating to achieve specific data patterns.


source_db Account Table


target_db Account Table

The account number has been masked with synthetic test data: