Saturday, 2 February 2013

How to create Test Data?

Let's assume you have a very basic testing need: you need around 50 customers created in your system in order to test it.  Let's assume it is a web-based application, although the concept applies to any technology or application.  You have a customer creation screen as shown below.

So how do you create the test data that you need?

Basically, there are 3 approaches:
  • Manual approach
  • Functional Automation Approach
  • Database Approach

Manual Approach:

In the manual approach, you would enter the data into the screens by hand to create a customer, and then repeat this for all 50 customers.  Needless to say, creating the data manually takes a long time.

The time taken for the example application would be:
For 1 Customer = 1 min.
For 50 Customers = 50 mins.

Functional Automation Approach:

In the automated approach, you would automate the user interface (UI) for creating the data, which speeds up the process of creating the required test data.  In our example, we would automate the web-based UI using an automation tool such as QTP, RFT, Selenium, etc. and then data-drive those tests to create the data that we require.

The time taken for the example application would be:
For 1 Customer = 10 seconds
For 50 Customers = 500 seconds = about 8 mins (8 min 20 s).
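
To make the data-driven idea concrete, here is a minimal sketch in Python.  It is not tied to any particular tool: the field names and the generate_customers and create_customer functions are hypothetical, and create_customer is only a stand-in for the UI automation step that a tool such as Selenium would perform against the customer creation screen.

```python
def generate_customers(count):
    """Build `count` synthetic customer records (field names are assumptions)."""
    return [
        {
            "name": f"Test Customer {i:02d}",
            "email": f"customer{i:02d}@example.com",
            "city": "Sample City",
        }
        for i in range(1, count + 1)
    ]


def create_customer(record, created):
    """Hypothetical stand-in for the UI automation step.

    A real script would open the customer creation screen, fill in the
    fields from `record`, and submit the form. Here we only record the call.
    """
    created.append(record)


def run_data_driven(records):
    """Run the same 'create customer' step once per data row."""
    created = []
    for record in records:
        create_customer(record, created)
    return created


customers = run_data_driven(generate_customers(50))
print(len(customers))
```

The point of the design is that the test step is written once and the data rows drive how many times it runs, so going from 50 customers to 500 only means changing the data.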

Database Approach:

In all probability, you will have plenty of real customer information lying around in your production database.  So our job is to query the right set of customers from the production database and load them into the test database.  Simple.  The data is ready to be used for testing.

Here in our example application, since it's a pretty straightforward requirement, we would fetch the first 50 rows from the Customers table in Production and insert those rows into the Customers table in the Test Database.  The workflow will be as depicted below.

The time taken for the example application would be:
For 50 Customers = 60 seconds = 1 min (Just an example)

NOTE: The above example assumes that the back end is a Microsoft SQL Server database and hence the "SELECT TOP 50" query.
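
As a rough illustration of the same workflow, here is a small Python sketch.  To keep it runnable anywhere, it uses two in-memory SQLite databases as stand-ins for Production and Test, so the query uses LIMIT instead of SQL Server's SELECT TOP 50.  The column names (id, name, email) and the copy_customers function are assumptions made for this example only.

```python
import sqlite3

# Assumed table layout; a real Customers table will have different columns.
SCHEMA = "CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"


def copy_customers(prod, test, limit=50):
    """Copy the first `limit` Customers rows from prod into test."""
    # SQLite syntax; on Microsoft SQL Server this would be SELECT TOP 50 ...
    rows = prod.execute(
        "SELECT id, name, email FROM Customers ORDER BY id LIMIT ?", (limit,)
    ).fetchall()
    test.executemany(
        "INSERT INTO Customers (id, name, email) VALUES (?, ?, ?)", rows
    )
    test.commit()
    return len(rows)


# Stand-in "production" database with 200 made-up customers.
prod = sqlite3.connect(":memory:")
prod.execute(SCHEMA)
prod.executemany(
    "INSERT INTO Customers (id, name, email) VALUES (?, ?, ?)",
    [(i, f"Customer {i}", f"customer{i}@example.com") for i in range(1, 201)],
)
prod.commit()

# Empty stand-in "test" database with the same table.
test = sqlite3.connect(":memory:")
test.execute(SCHEMA)

copied = copy_customers(prod, test)
count_in_test = test.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]
print(copied, count_in_test)
```

Against real databases, the two connections would point at the actual Production and Test servers, and the copy would normally run only after the data has been secured.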

As you can see, the database approach is much faster than either of the other approaches.  The effort savings are enormous for real-world test data requirements, where the data volumes are much higher.

This methodology of creating test data directly from the Production data forms the cornerstone and the building block of the concept called Test Data Management.  Of course, we are dealing with real production data and hence we need to secure the data before loading it into the Test Database, but we will deal with all those topics in a separate post.

Hope the information was useful in giving a basic idea about Test Data creation.  I welcome your comments.  Cheers.

About the Author

Rajaraman Raghuraman has nearly 8 years of experience in the Information Technology industry focusing on Product Development, R&D, Test Data Management and Automation Testing.  He has architected a TDM product from scratch and currently leads the TDM Product Development team in Cognizant.  He is passionate about Agile Methodologies and is a huge fan of Agile Development and Agile Testing.  He blogs at Test Data Management Blog & Agile Blog.  Connect with him on Google+


  1. Thanks for the post.
    It was simple and easy to learn about test data management.

    Are testers given access to the database so that they can use the real data?

    Looking forward to seeing many more posts.

    1. Thanks Srinivas.

      Ideally speaking Testers will not get access to Test data as the scope of test data management will start even before the actual testing starts, in fact it starts right from the Test Planning stage itself. It is quite tricky in case of outsourced testing but the concept of TDM is gaining significant traction these days. I am planning to write more about these challenges as well in the future. Thanks for the support.

  2. Nice post and straight to the point. I always believe that an application or a system is nothing but a medium through which data flows from screen to screen, and which manipulates that data and presents it to the user in a readable/usable format. Though the statement can be argued over, I accept it as it stresses the importance of test data management in software development.

    We focus on two areas. Firstly, we create new data using the system, which is achieved using automated tests as mentioned in your post. Secondly, because most production activity revolves around maintaining and servicing existing data, we have a separate suite of tests that works on the existing data. To identify REAL issues, it is imperative that we clone the prod data, scramble it and use it for testing, as it is a true reflection of the production activity. However, this comes with a pain, as the cloning needs to be tested to ensure data integrity before commencing the testing. Looking forward to reading the rest of your posts to get a better understanding of TDM.

  3. Thanks Logu.

    You are right about the two areas. I am planning to cover each of these areas, the challenges that we are currently facing & strategies for each of them.

    Rajaraman R
    TDM Blog || Agile Blog