Description

The MultiWeightCSVGen Generator assigns a weight to each value in a set to determine how often that value is generated. The values and their respective weights are read from a delimited file at runtime.


For example, if valueList contained (economy, mid size, luxury) and the percentList contained (50, 35, 15): 

  • economy would be generated 50% of the time,
  • mid size would be generated 35% of the time, 
  • luxury would be generated 15% of the time.


Note: It is best to have the percentages add up to 100%, but the Generator has been designed to allow a total greater than 100% for the rare cases where that is needed.
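The weighting idea above can be sketched in a few lines of Python. This is only an illustration of weighted selection in general, not the Generator's actual implementation; the function name weighted_pick and the seed value are assumptions made for the example. Each weight claims a slice of the range from 0 to the sum of all weights, so the weights act as relative shares and the sum does not have to be exactly 100.

```python
import random

def weighted_pick(values, percents, rng):
    """Pick one value with probability proportional to its weight.

    Hypothetical helper for illustration; not part of the product.
    """
    total = sum(percents)          # need not be exactly 100
    roll = rng.random() * total    # uniform in [0, total)
    running = 0
    for value, pct in zip(values, percents):
        running += pct             # upper edge of this value's slice
        if roll < running:
            return value
    return values[-1]              # guard against floating-point edge cases

values = ["economy", "mid size", "luxury"]
percents = [50, 35, 15]

rng = random.Random(42)            # fixed seed so the run is repeatable
picks = [weighted_pick(values, percents, rng) for _ in range(10_000)]
shares = {v: picks.count(v) / len(picks) for v in values}
print(shares)
```

Over many picks, the observed shares should land close to 0.50, 0.35, and 0.15.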


Generator Parameters

The following parameters may be configured for the MultiWeightCSVGen Generator. Items with an asterisk* are required. 

  • filePath* - Defines the path where the delimited file is located. 
  • fileSubDir - Defines an optional subdirectory under the filePath where the delimited file exists.
  • fileName* - Defines the name of the delimited file to open.
  • delimiter* - Defines the delimiter used to separate column data within the delimited file.
  • percentColumnName* - Defines the name of the column in the delimited file that holds the percentages.
  • valueColumnName* - Defines the name of the column in the delimited file that holds the values.
  • seed - Setting a seed ensures that the same random data is produced each time data is generated.
  • exactPercentage* - When 'true', the generated distribution matches the specified percentages exactly. When 'false', the distribution is determined in real time as rows are generated and may not match the percentages exactly; the more rows generated, the closer the distribution comes to the specified percentages.
  • list - Stores one value on each line in the list. To add values, type them in and press ENTER. Note that the listed values are used only in simulation mode; when running a true Scenario, the data will be loaded from the specified resource.
  • totalLoopCount - Defines the total number of rows to be generated by the Scenario.
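The effect of the seed parameter, and of setting exactPercentage to 'false', can be illustrated with Python's standard random module. This is a sketch under assumptions: it stands in for the Generator's behavior rather than reproducing it, and the function name sample_shares and the seed value 7 are hypothetical.

```python
import random

values = ["economy", "mid size", "luxury"]
percents = [50, 35, 15]

def sample_shares(n, seed):
    """Draw n independent weighted picks; return each value's observed percentage."""
    rng = random.Random(seed)
    picks = rng.choices(values, weights=percents, k=n)
    return {v: 100 * picks.count(v) / n for v in values}

# The same seed reproduces the same data on every run.
assert sample_shares(100, seed=7) == sample_shares(100, seed=7)

# Independent picks only approximate the target percentages;
# the approximation tightens as the number of rows grows.
for n in (100, 10_000, 200_000):
    print(n, sample_shares(n, seed=7))
```

With a small row count the observed percentages can drift a few points from 50/35/15, which is the behavior to expect when an exact match is not enforced.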


totalLoopCount Parameter

Use this parameter when data is generated dynamically (i.e., randomly) for a child. For example, a user may have many addresses, with the number of addresses determined by defined logic. In this case, the Scenario does not know the total loop count in advance, so the MultiWeightCSVGen Generator cannot divide the percentages properly.


Configuring the totalLoopCount Parameter tells the Generator how many rows to expect, so the percentages are divided correctly based on the entered value.
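To see why a known total matters, consider how exact percentages could be turned into row counts. The sketch below is one possible approach (largest-remainder rounding) and is an assumption for illustration; the article does not describe how the Generator divides rows internally, and exact_counts is a hypothetical name.

```python
import random

def exact_counts(values, percents, total_loop_count):
    """Split total_loop_count rows across values in proportion to percents.

    Hypothetical helper: uses largest-remainder rounding so the counts
    always add up to total_loop_count.
    """
    total_weight = sum(percents)
    raw = [total_loop_count * p / total_weight for p in percents]
    counts = [int(r) for r in raw]                 # whole rows per value
    leftover = total_loop_count - sum(counts)      # rows lost to rounding down
    # Give the leftover rows to the values with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return dict(zip(values, counts))

values = ["economy", "mid size", "luxury"]
counts = exact_counts(values, [50, 35, 15], total_loop_count=100)
print(counts)

# Expand the counts into rows and shuffle them with a fixed seed.
rows = [v for v, c in counts.items() for _ in range(c)]
random.Random(1).shuffle(rows)
```

Without a total, there is nothing to multiply the percentages by, which is the situation the totalLoopCount Parameter addresses.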



Use Case 1 - Generate Car Types Based on CSV File Percentages

For this example, a CSV file determines how often each of three car types is generated in the output. Based on the CSV file, the car types should be generated as follows:

  • economy - generate 50% of the time
  • mid size - generate 35% of the time
  • luxury - generate 15% of the time



Domain Setup

A CarRental Domain has been created with these Attributes: 

  • type - generates car type based on CSV percentage values using the MultiWeightCSVGen Generator
  • numberDays - generates the number of rental days using the RandomNumGen Generator
  • discount - generates a discount value applied to rental using the RandomMoneyGen Generator


The Domain loopCount has been set to '100', a DelimitedFileReceiver has been added, and a Scenario has been created.


MultiWeightCSVGen Generator Configuration

The MultiWeightCSVGen Generator has been assigned to the type Attribute.


It will read from a CSV file titled 'MultiWeightCSV.csv' and generate the exact percentages as defined in the file. The following parameters have been configured: 

  • filePath - set to the #{resource.output.directory} organization resource path
  • fileName - entered the name of the file (i.e., MultiWeightCSV.csv)
  • delimiter - entered a comma (,)
  • percentColumnName - entered 'Percent' as defined in the file for the percentage value
  • valueColumnName - entered 'Type', which is defined in the file and will be the retrieved value
  • exactPercentage - set to 'True'
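For orientation, the sketch below shows what a file matching this configuration might look like and how its two columns map to values and weights. The file contents are an assumption reconstructed from the column names ('Type', 'Percent') and percentages given in this article; the actual MultiWeightCSV.csv is not reproduced here, and load_weights is a hypothetical helper, not a product API.

```python
import csv
import io

# Assumed contents of MultiWeightCSV.csv, inferred from this use case.
sample_file = io.StringIO(
    "Type,Percent\n"
    "economy,50\n"
    "mid size,35\n"
    "luxury,15\n"
)

def load_weights(handle, delimiter, value_column, percent_column):
    """Read (value, percent) pairs from a delimited file with a header row."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [(row[value_column], float(row[percent_column])) for row in reader]

weights = load_weights(sample_file, ",", "Type", "Percent")
print(weights)
```

The delimiter, valueColumnName, and percentColumnName parameters correspond to the three arguments passed to load_weights in this sketch.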




Sample Output

For this example, the generated count for each type across the 100 generated records was: 

  • economy - 50 times
  • mid size - 35 times
  • luxury - 15 times

Note: Below is a snippet of the generated delimited file. The entire output file has been attached to the bottom of this article for viewing.