Showing posts with label Data Masking. Show all posts

Thursday, 7 March 2013

Data Archive in Test Data Management (TDM)

In previous posts, I covered Data Subset, Data Masking, Test Data Ageing and Test Data Refresh.  In this post, we will focus on Data Archival and why it matters to the process of Test Data Management.

What does Data Archival typically mean?
  • Size Management
    • A database grows over time, so you need an efficient mechanism to actively manage its size.
  • Archival of older data
    • Older data can be moved to storage that occupies less disk space and retrieved later whenever it is needed.
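
The archival idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the `orders` and `orders_archive` tables, their columns, and the helper `archive_older_than` are hypothetical names, and an in-memory SQLite table stands in for whatever cheaper storage a real project would use.

```python
import sqlite3

# In-memory database with a hypothetical active table and archive table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_on TEXT, amount REAL)")
conn.execute("CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, created_on TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2011-05-10", 120.0),
        (2, "2012-11-02", 75.5),
        (3, "2013-02-20", 42.0),
    ],
)


def archive_older_than(conn, cutoff):
    """Move rows created before cutoff (ISO date string) into the archive table.

    Returns the number of rows removed from the active table.
    """
    with conn:  # one transaction: the copy and the delete succeed or fail together
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE created_on < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM orders WHERE created_on < ?", (cutoff,))
    return cur.rowcount


moved = archive_older_than(conn, "2013-01-01")
active_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
archived_count = conn.execute("SELECT COUNT(*) FROM orders_archive").fetchone()[0]
```

Archived rows stay queryable in `orders_archive`, so they can be copied back into the active table if a test needs them again.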

Types of Archive Mechanisms:

Monday, 4 March 2013

What is Test Data Ageing in TDM?

In my previous posts, I covered Data Subset and Data Masking in TDM.  In this post, we will focus on Test Data Ageing.

Test Data Ageing is useful for time-based testing.  Let's assume you create a customer and it takes 48 hours for that customer to be activated.  What if you have to test a scenario that occurs after those 48 hours? Will you wait 48 hours for that scenario to happen? The answer is no.  Then how will you handle this scenario?

There are basically two approaches:

  • Tamper with the system dates
    • Although in some cases it is possible to change the system date and continue testing, this method fails if the date is generated by a database server or an application server instead of the client.
  • Tamper with the dates in the backend
    • This is the most viable and practical solution for such scenarios.  In this approach, we modify the date in the backend so that it reflects the new date.  Care should be taken to ensure that neither data integrity nor data semantics is lost.

This method of modifying the date according to the needs of the scenario is known as Test Data Ageing.  Depending on the scenario that needs to be tested, we can either back-date or front-date the given date.
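
As a rough sketch of back-dating, the snippet below shifts every date field of a record by the same offset so that the gaps between the dates stay intact. The record layout, the field names and the helper `age_record` are assumptions made up for illustration. In a real system the same shift would be applied to the backend tables.

```python
from datetime import datetime, timedelta

# Hypothetical customer record; the field names are illustrative only.
customer = {
    "customer_id": 101,
    "created_at": datetime(2013, 3, 1, 9, 0),
    "activation_due_at": datetime(2013, 3, 3, 9, 0),  # created_at + 48 hours
    "status": "PENDING",
}

# Every date field of the record must be listed here so they move together.
DATE_FIELDS = ("created_at", "activation_due_at")


def age_record(record, offset):
    """Return a copy of record with every date field shifted by the same offset.

    A negative offset back-dates the record; a positive one front-dates it.
    Shifting all date fields together keeps the intervals between them unchanged.
    """
    aged = dict(record)
    for field in DATE_FIELDS:
        aged[field] = record[field] + offset
    return aged


# Back-date by 48 hours so the activation window has already elapsed.
aged_customer = age_record(customer, timedelta(hours=-48))
```

Shifting only one of the dates would break the relationship between creation and activation, which is the kind of loss of data semantics the warning above refers to.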


Wednesday, 27 February 2013

Commonly Used Data Masking Techniques - TDM

In my previous posts, I discussed Data Subset and Data Masking.  In this post, I will discuss the Data Masking techniques that are widely used.  The list is by no means exhaustive, but it gives a general idea of the techniques that are available.

  • Random Substitution
    • In this technique, the value to be masked is replaced with a random value.  Depending on the nature of the random value, the technique can be further categorized into:
      • Random Numbers
      • Random Dates
      • Random Seed Values, for example:
        • Names
        • Addresses
        • SSNs
        • Credit Card Numbers
        • Telephone numbers
        • And a lot more
      • Random Alphanumerics
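
The four categories above can be illustrated with a small sketch. It is not a production masking tool: the seed list, the record fields and the helper names are all hypothetical, and Python's `random` module is used only for brevity.

```python
import random
import string
from datetime import date, timedelta

# Hypothetical seed list; a real project would use a much larger set.
SEED_FIRST_NAMES = ["Alex", "Maria", "Chen", "Priya", "Tom"]


def random_number(length, rng):
    """Random Numbers: a digit string of the requested length."""
    return "".join(rng.choice(string.digits) for _ in range(length))


def random_date(start, end, rng):
    """Random Dates: a date between start and end, inclusive."""
    return start + timedelta(days=rng.randint(0, (end - start).days))


def random_seed_value(seed_values, rng):
    """Random Seed Values: pick a replacement from a prepared list."""
    return rng.choice(seed_values)


def random_alphanumeric(length, rng):
    """Random Alphanumerics: upper-case letters and digits."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def mask_record(record, rng):
    """Return a copy of record with the sensitive fields substituted."""
    masked = dict(record)
    masked["first_name"] = random_seed_value(SEED_FIRST_NAMES, rng)
    masked["phone"] = random_number(len(record["phone"]), rng)
    masked["dob"] = random_date(date(1950, 1, 1), date(2000, 12, 31), rng)
    masked["reference_code"] = random_alphanumeric(len(record["reference_code"]), rng)
    return masked


original = {
    "first_name": "Jonathan",
    "phone": "5551234567",
    "dob": date(1981, 6, 14),
    "reference_code": "AB12CD34",
    "account_type": "SAVINGS",  # not sensitive, left untouched
}
masked = mask_record(original, random.Random())
```

Each masked value keeps the length and type of the original, so the application under test still accepts the data even though the real values are gone.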

Saturday, 16 February 2013

Data Masking in TDM

In my previous posts, I explained the challenges in Production Cloning.  One of the major challenges in the Production Cloning approach is Data Security.  This post focuses on the solution for Data Security: Data Masking.

As already explained, Data Masking is the process of masking the sensitive fields in the complete data set.  The whole objective of data masking is to ensure that no sensitive data leaks into non-production regions such as the Dev and Testing regions.

Which sensitive fields need to be masked?  That depends on the project's needs, but some of the generic fields that need to be masked are:

  • Personal information like first names, last names, email IDs, dates of birth, phone and fax numbers, SSNs, National Insurance numbers and other national unique identifiers.
  • In the Banking, Financial Services & Insurance industry - bank balances, account numbers, credit card numbers, policy numbers, etc.
  • In the Healthcare industry - PHI attributes like medical record numbers, member IDs, etc.

This list is by no means exhaustive, but it gives a fair idea of how many fields are sensitive in nature and need to be handled with care.  Any lapse in masking any of these fields might have a big impact on the organization as a whole.

Challenges in Data Masking

Wednesday, 13 February 2013

Top 3 Challenges in using Production data in Test Environments

In my previous post "How to create Test Data", I explained the concept of creating test data directly from production data.  In this post, we will concentrate on the top 3 challenges in using production data for testing purposes.

Data Security

This is by far the most crucial challenge of using production data in test environments.  Production data can contain a lot of sensitive information.  Even though the data sets in the production database are rich in nature, using production data involves a lot of risk.  For example, if you are testing an application for a bank, production data will contain real customer information like names, addresses, account numbers, balances, credit card numbers, etc.  As you can see, using this data for testing exposes the bank to huge security risks.  So how do we overcome this? The answer is Data Masking.

Data Masking is the process of masking the sensitive fields in the complete data set.  Please read my upcoming posts on Data Masking and the techniques used for Data Masking for more details.  The following figure depicts the data security challenge and the approaches.

Data Security Challenge