Description

Synthetic Data Replacement (SDR) is a data masking technique that replaces sensitive data values with synthetically generated data values. 


Important: GenRocket does not access or store PII. Sensitive information remains securely within your environment, behind your corporate firewall. GenRocket does not require access to the actual production data: only the metadata is read, so production data values are never exposed.


SDR Example

For example, you can use SDR to replace (i.e., mask) SSN values with realistic, patterned synthetic test data that covers a broad spectrum of data values and improves test coverage.

Test Data Scenario        GenRocket Synthetic Data Replacement
------------------------  ---------------------------------------------------------------------
Sequential Replacement    Generates simple incrementing values (e.g., 001-00-0001, 001-00-0002)
Valid Format Replacement  Produces structurally correct values (e.g., 534-72-9823, 207-65-3490)
Character Sets            Injects invalid characters (e.g., @@@-##-****) to test input validation
Boundary Values           Outputs edge values like 000-00-0000 and 999-99-9999
Missing or Null Data      Replaces with nulls, empty strings, or whitespace as configured
Illegal Format Testing    Replaces with malformed values (e.g., letters in numeric fields)
Volume Stress Testing     Generates large volumes of synthetic data to simulate real-world scale
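To make the scenarios above concrete, here is a minimal Python sketch of a few of these replacement strategies. It is an illustration only and does not use GenRocket's generators or API. Every name in it (format_ssn, sequential, valid_format, EDGE_CASES, replace_column) is hypothetical, and the rule used for "structurally correct" values (area not 000, 666, or 900-999; group not 00; serial not 0000) is an assumption about what a valid SSN format means here.

```python
import random


def format_ssn(n):
    """Format a non-negative integer as a 9-digit, SSN-shaped string."""
    digits = f"{n % 1_000_000_000:09d}"
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def sequential(count, start=1_000_001):
    """Sequential replacement: 001-00-0001, 001-00-0002, ..."""
    return [format_ssn(start + i) for i in range(count)]


def valid_format(count, seed=0):
    """Valid format replacement: random values that follow the assumed SSN structure."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    areas = [a for a in range(1, 900) if a != 666]
    values = []
    for _ in range(count):
        area = rng.choice(areas)
        group = rng.randint(1, 99)
        serial = rng.randint(1, 9999)
        values.append(f"{area:03d}-{group:02d}-{serial:04d}")
    return values


# Fixed value lists for the negative-test scenarios (illustrative values only).
EDGE_CASES = {
    "boundary": ["000-00-0000", "999-99-9999"],
    "invalid_characters": ["@@@-##-****"],
    "missing": [None, "", "   "],
    "illegal_format": ["ABC-DE-FGHI"],
}


def replace_column(rows, column, values):
    """Return copies of the rows with `column` overwritten by synthetic values.

    The synthetic values are cycled if there are fewer values than rows.
    The input rows are not modified.
    """
    return [{**row, column: values[i % len(values)]} for i, row in enumerate(rows)]


# Placeholder rows stand in for records that hold sensitive values.
rows = [{"id": 1, "ssn": "SENSITIVE-1"}, {"id": 2, "ssn": "SENSITIVE-2"}]
masked = replace_column(rows, "ssn", sequential(len(rows)))
print(masked)
# [{'id': 1, 'ssn': '001-00-0001'}, {'id': 2, 'ssn': '001-00-0002'}]
```

Swapping sequential(...) for valid_format(...) or one of the EDGE_CASES lists changes the test scenario without changing the rest of the flow, and the synthetic values are generated without ever consulting the original column values.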


Advantages of SDR Over Traditional Masking Techniques

Sensitive Data 

  • remains secure within the production environments
  • is never touched or altered in the source
  • is replaced by synthetically generated data in the target database or generated file


Synthetically Generated Data

  • contains no real sensitive values, so it cannot expose PII
  • can yield more accurate data patterns than exist in production
  • can produce edge cases that do not currently exist in production


Where is SDR Used in GenRocket? 

  • G-Migration+ - Migrate subsets of data for one or more tables from a source database to a target database. SDR can be performed on selected data columns in each table during the migration process.

  • File Masking - SDR can be performed for sensitive data values within a source file (e.g., JSON, Delimited, XML). Values are replaced with synthetic data values in the generated file.
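As a rough illustration of the File Masking idea, the sketch below replaces the values of sensitive keys in a JSON document with synthetic values and produces new output, leaving the source text unchanged. It is plain Python, not GenRocket's implementation. The names mask_json and next_ssn, the "ssn" key, and the sample document are all hypothetical.

```python
import itertools
import json


def mask_json(text, sensitive_keys, make_value):
    """Return a new JSON string in which every value stored under a
    sensitive key is replaced by a freshly generated synthetic value."""

    def walk(node):
        if isinstance(node, dict):
            # Replace sensitive values outright; recurse into everything else.
            return {
                key: (make_value() if key in sensitive_keys else walk(value))
                for key, value in node.items()
            }
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node  # scalars under non-sensitive keys pass through unchanged

    return json.dumps(walk(json.loads(text)), indent=2)


# Hypothetical generator: simple incrementing SSN-shaped values.
# (Stays in the 4-digit serial range for up to 9,999 values.)
counter = itertools.count(1)


def next_ssn():
    return f"001-00-{next(counter):04d}"


# Placeholder document standing in for a source file with sensitive values.
source = json.dumps({
    "customers": [
        {"name": "A. Example", "ssn": "SOURCE-VALUE-1"},
        {"name": "B. Example", "ssn": "SOURCE-VALUE-2"},
    ]
})

masked = mask_json(source, {"ssn"}, next_ssn)
print(masked)
```

The replacement values are generated independently of the original values, so the generated file keeps the structure and non-sensitive fields of the source file while carrying none of its sensitive values. The same walk-and-replace approach could be adapted to delimited or XML files by swapping the parser.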