Overview

When a bug is discovered in your production environment, you have to become a detective and immediately seek answers to the following questions:

  • What production data became corrupted because of the bug?
  • Where in the source code does the bug exist?
  • What steps need to be taken to repair the source code?
  • What steps need to be taken to repair the production data?


Two Important Rules for Ensuring Bug Fixes Work in Production

To be confident that your bug fixes will work once they are pushed to your production environment, follow these two rules when fixing production bugs:
  1. Never push repaired code into production unless it has been thoroughly tested and fully validated.
  2. Never use corrupted production data to test your repaired code.

Your first instinct may be to test the repaired code against the corrupted production data. This is not recommended, because the contents of that data are unknown and potentially corrupted. To fully test the repaired code, you must have complete control of your test data. Generate synthetic test data for this purpose instead of using production data that may be corrupted.
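As a minimal sketch of what "complete control of your test data" means, the Python example below pairs every synthetic input with an expected result that is known before the test runs. The order-total function, the field names, and the values are all hypothetical and chosen only for illustration; in practice the rows would come from your test data generator rather than a hand-written list.

```python
from decimal import Decimal

# Hypothetical synthetic cases: (quantity, unit_price, expected_total).
# Each expected value is written down in advance, which is only possible
# because we control every input value ourselves.
SYNTHETIC_ORDERS = [
    (0, Decimal("19.99"), Decimal("0.00")),    # boundary: empty order
    (1, Decimal("0.01"), Decimal("0.01")),     # boundary: smallest price
    (3, Decimal("19.99"), Decimal("59.97")),   # typical order
    (999, Decimal("0.01"), Decimal("9.99")),   # large quantity
]


def order_total(quantity, unit_price):
    """Hypothetical repaired code under test."""
    return quantity * unit_price


def failing_cases(cases):
    """Return every case whose actual result differs from the expected one."""
    return [
        (quantity, unit_price, expected)
        for quantity, unit_price, expected in cases
        if order_total(quantity, unit_price) != expected
    ]
```

An empty list from failing_cases(SYNTHETIC_ORDERS) indicates that the repaired code handled every controlled case. With corrupted production rows there would be no trustworthy expected value to compare against.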


You may also need to copy your Production Database to a Test Database. The purpose of this copy is not to test the repaired code, but to test the automated scripts you must write to repair the corrupted data in your Production Database.


Recommended Methodology for Avoiding Broken Code and Corrupted Data

The methodology described below defines three phases that, when followed, help you rid your production environment of broken code and corrupted data. The three phases are:

  1. Find the Bug
  2. Repair the Code
  3. Repair the Corrupted Data

Phase 1: Find the Bug

  1. Identify the data that was corrupted in production (normally one or more rows of data, within one or more tables, within one or more databases).

  2. Identify the code that caused the corruption (this code is where the bug exists).

Phase 2: Repair the Code

  1. Determine what modifications need to be made to the code.

  2. Write one or more stories that describe how you want to test and validate the new code.

  3. Repair the code that contains the bug.

  4. Use GenRocket Domains, Attributes, Generators, Receivers & Scenarios to model the data you need to test the repaired code.

  5. Use GenRocket Test Data Cases, combined with Test Data Rules and Test Data Queries, to create the exact test data needed to exercise the repaired code in as many ways as necessary to validate it thoroughly.

  6. Implement Unit Tests that use GenRocket Test Data Cases to generate data in real time and then test and validate the repaired code.

  7. Run your Unit Tests to validate the repaired code.

  8. Once the Unit Tests successfully validate the repaired code, promote the repaired code to production.

  9. Add the Unit Tests to your automated regression suite so that the code is tested automatically every day and before every code release.
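Steps 6 through 9 can be sketched as a unit test that generates its data at run time. The example below is an illustration only: generate_email_cases is a hypothetical stand-in for a real test data generator (it is not a GenRocket API), and normalize_email is a hypothetical piece of repaired code. Each noisy input is derived from a clean value, so the expected result is always known.

```python
import random
import unittest


def normalize_email(raw):
    """Hypothetical repaired code: trims whitespace and lowercases."""
    return raw.strip().lower()


def generate_email_cases(count, seed=1234):
    """Hypothetical stand-in for a test data generator.

    Yields (raw_input, expected) pairs. The fixed seed makes every run
    produce the same data, so a failure can be reproduced.
    """
    rng = random.Random(seed)
    for i in range(count):
        expected = f"user{i}@example.com"
        # Derive a noisy input from the known clean value.
        noisy = "".join(
            ch.upper() if rng.random() < 0.5 else ch for ch in expected
        )
        padding = " " * rng.randint(0, 3)
        yield padding + noisy + padding, expected


class NormalizeEmailRegressionTest(unittest.TestCase):
    def test_generated_cases(self):
        for raw, expected in generate_email_cases(50):
            # subTest reports each failing input separately.
            with self.subTest(raw=raw):
                self.assertEqual(normalize_email(raw), expected)
```

Saved in a test module, a test like this can be run with "python -m unittest" and picked up by the regression suite alongside the existing tests.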

Phase 3: Repair the Corrupted Data

  1. Write one or more stories that describe the steps necessary to repair the corrupted data in production.

  2. Write the necessary Repair scripts to repair the corrupted data in production.

  3. Write the necessary Validation scripts to check, validate, and report findings.

  4. Make a copy of the Production database (at least the subset of the data that is corrupted) to a Test database.

  5. Run the Repair scripts on the Test database.
     
  6. Run the Validation scripts on the Test database.

  7. Review the Validation report.

  8. Once the Validation report shows 100% successful validation, do the following:
    1. Run the Repair Scripts on Production
    2. Run the Validation Scripts on Production
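The Phase 3 workflow can be sketched as follows, using Python's built-in sqlite3 module. The orders table, its columns, and the repair rule are hypothetical, and the copy step uses SQLite's backup facility as a stand-in for whatever dump-and-restore procedure your database provides. Production is only touched after the rehearsal on the test copy validates cleanly.

```python
import sqlite3

# Hypothetical corruption: total_cents no longer equals
# quantity * unit_price_cents for some rows of an "orders" table.
MISMATCH = "total_cents != quantity * unit_price_cents"


def copy_to_test_db(production):
    """Copy the production database into a separate in-memory test database."""
    test = sqlite3.connect(":memory:")
    production.backup(test)
    return test


def run_repair(conn):
    """Repair script: recompute the corrupted totals. Returns rows changed."""
    cursor = conn.execute(
        f"UPDATE orders SET total_cents = quantity * unit_price_cents "
        f"WHERE {MISMATCH}"
    )
    conn.commit()
    return cursor.rowcount


def run_validation(conn):
    """Validation script: count the rows that are still corrupted."""
    checked = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    bad = conn.execute(
        f"SELECT COUNT(*) FROM orders WHERE {MISMATCH}"
    ).fetchone()[0]
    return {"rows_checked": checked, "rows_still_corrupted": bad,
            "passed": bad == 0}


def repair_with_rehearsal(production):
    """Rehearse on a test copy; repair production only if validation passes."""
    test = copy_to_test_db(production)
    try:
        run_repair(test)
        report = run_validation(test)
    finally:
        test.close()
    if not report["passed"]:
        return report  # production is left untouched
    run_repair(production)
    return run_validation(production)
```

Keeping the repair and validation logic in separate functions mirrors the separate Repair and Validation scripts in the steps above, so the same validation can be run unchanged against both the Test and Production databases.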

In Summary

The three phases and their steps defined above provide a sound methodology for fixing production code and repairing production data. In addition, when GenRocket is used with good automated testing practices, where up to 98% of the code is eventually tested thoroughly, you gain the following benefits:

  1. The number of bugs entering the production environment is drastically reduced.

  2. The ability to implement new code and deliver new features to production is greatly increased.

  3. Overall, the cost of implementing new features, fixing bugs, and repairing corrupted production data is drastically reduced, sometimes by as much as 50%.