Data Masking in TDM

In my previous posts, I explained the challenges in Production Cloning. One of the major challenges in the Production Cloning approach is Data Security. This post focuses on the solution for Data Security: Data Masking.

As already explained, Data Masking is the process of masking the sensitive fields in a data set while leaving the rest of the data intact. The whole objective of data masking is to ensure that no sensitive data leaks into non-production regions such as the Dev and Testing regions.

What are the sensitive fields that need to be masked? That depends on the project's needs, but some of the generic fields that need to be masked are:

  • Personal information such as first names, last names, email IDs, dates of birth, phone and fax numbers, Social Security Numbers, National Insurance Numbers, and other national unique identifiers.
  • In the Banking, Financial Services & Insurance industry: bank balances, account numbers, credit card numbers, policy numbers, etc.
  • In the Healthcare industry: PHI attributes such as medical record numbers, member IDs, etc.

This list is by no means exhaustive, but it gives a fair idea of how many fields are sensitive in nature and need to be handled with care. A lapse in masking any one of these fields can have a big impact on the Organization as a whole.
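To make the idea concrete, here is a minimal sketch of masking a single record in Python. The field names, the salt, and the masking rules (keeping the last four digits of the SSN, replacing other values with a salted hash) are all assumptions made for illustration; a real project would pick its own fields and rules, and would most likely use a dedicated masking tool.

```python
import hashlib

# Hypothetical field names; a real project would derive this list from its own data model.
SENSITIVE_FIELDS = {"first_name", "last_name", "email", "ssn"}


def mask_value(field, value, salt="demo-salt"):
    """Return a masked stand-in for one sensitive value (illustrative rules only)."""
    if field == "ssn":
        # Keep only the last four digits, hide the rest.
        digits = [c for c in value if c.isdigit()]
        return "XXX-XX-" + "".join(digits[-4:])
    # A salted hash gives the same masked value for the same input,
    # so records that matched before masking still match afterwards.
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:10]
    if field == "email":
        return f"user_{digest}@example.com"
    return f"{field}_{digest}"


def mask_record(record):
    """Mask the sensitive fields of a record and leave the other fields untouched."""
    return {
        key: mask_value(key, value) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }


record = {"first_name": "Jane", "email": "jane@corp.com", "ssn": "123-45-6789", "city": "Austin"}
print(mask_record(record))
```

Non-sensitive fields such as the city pass through unchanged, which keeps the masked data usable for testing.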

Challenges in Data Masking

Challenges in Production Cloning approach

In my previous articles, I discussed the topics "How to create Test Data" and "Top 3 Challenges in using Production data in Test Environments". In this post we will focus on the challenges that we face in the Production Cloning approach and how to overcome them.

1.  Infrastructure

Even though it is highly recommended to keep the Test Environment along the same lines as Production, it is not always feasible to test under those real-time conditions. Performance / load / stress tests should run against a database that exactly mimics Production, but such expensive infrastructure can be overkill for Functional Testing. Cloning, however, might force you to have production-like infrastructure, which translates into higher costs for the customer.

2.  High Storage Costs

Another major challenge associated with Production Cloning is that all of the production data needs to be stored in the testing region. Assuming the production data is 50 TB (terabytes), the Test Database also needs to hold 50 TB of data, so storage has to be provided for all of it. And with the databases being backed up regularly, that means even higher storage costs for the customer.
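The arithmetic above can be sketched in a few lines of Python. The 50 TB figure comes from the example; the number of retained backups (3) and the assumption that every backup is a full, uncompressed copy are hypothetical and only there to show how quickly the total grows.

```python
def clone_storage_tb(production_tb, backup_copies):
    """Storage needed for a full clone plus its retained backups, in TB.

    Assumes each backup is a full, uncompressed copy of the clone.
    """
    return production_tb * (1 + backup_copies)


# 50 TB of production data, with a hypothetical 3 retained backups of the test database.
print(clone_storage_tb(50, 3))  # 200
```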