In my previous posts I discussed Data Subsetting and Data Masking. In this post, I will discuss the data masking techniques that are widely used. The list is by no means exhaustive, but it gives a general idea of the techniques that are available.
- Random Substitution
- In this technique, the value to be masked is replaced with a random value. Depending on the nature of the random value, the technique can be divided into further sub-categories.
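As a minimal sketch of random substitution, assuming the masked field is a first name and that replacements are drawn from a small lookup pool: the names `FIRST_NAMES` and `random_substitute` are hypothetical, and a real masking tool would typically use a much larger dataset.

```python
import random

# Hypothetical replacement pool; real tools usually ship far larger lookup sets.
FIRST_NAMES = ["Alice", "Bob", "Carol", "David", "Eve"]


def random_substitute(value, pool=FIRST_NAMES, rng=random):
    """Return a random replacement from the pool.

    The original value is deliberately ignored, so the output carries
    no information about the input.
    """
    return rng.choice(pool)


masked = [random_substitute(name) for name in ["John", "Mary", "Steve"]]
print(masked)
```

Because the replacement is chosen independently of the input, the same original value can map to different masked values on different runs.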
- Algorithmic Substitution
- Even with random substitution, certain fields must follow specific algorithms or formats. For example, a credit card number needs to pass the mod-10 (Luhn) check, and an SSN must be exactly 9 digits in the format AAA-GG-RRRR, where:
- A -> Area Code within US
- G -> Group Code
- R -> Random number
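The two examples above can be sketched roughly as follows. This is an illustration under stated assumptions, not a production generator: the function names are hypothetical, the card number is made of random digits with only the final check digit computed, and the SSN helper enforces only the AAA-GG-RRRR shape, not the real allocation rules for area or group codes.

```python
import random


def luhn_check_digit(partial):
    """Compute the mod-10 (Luhn) check digit for a string of digits."""
    total = 0
    # Walk right to left. The rightmost digit of `partial` is doubled first,
    # because the check digit will be appended to its right.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def is_luhn_valid(number):
    """Return True if the last digit is the correct Luhn check digit."""
    return luhn_check_digit(number[:-1]) == number[-1]


def mask_card_number(original, rng=random):
    """Replace a card number with random digits of the same length that pass mod-10."""
    body = "".join(str(rng.randint(0, 9)) for _ in range(len(original) - 1))
    return body + luhn_check_digit(body)


def mask_ssn(rng=random):
    """Generate a value in the AAA-GG-RRRR shape (format only, assumed rule)."""
    return f"{rng.randint(1, 899):03d}-{rng.randint(1, 99):02d}-{rng.randint(1, 9999):04d}"


print(mask_card_number("4111111111111111"))
print(mask_ssn())
```

The masked card number keeps the original length and passes the same validation as a real one, so downstream application checks continue to work on masked data.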
- Sequence generation: this technique replaces the original values with a generated sequence of data.
- Selective Mask
- This technique masks only a selected portion of the data, for example altering only the domain name of an email address.
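A selective mask on an email address might look like the sketch below. The function name `mask_email_domain` and the default replacement domain `example.com` are assumptions chosen for illustration.

```python
def mask_email_domain(email, new_domain="example.com"):
    """Replace only the domain part of an email address, keeping the local part."""
    local, sep, _domain = email.rpartition("@")
    if not sep:
        # No "@" found: not an email address, so leave the value unchanged.
        return email
    return f"{local}@{new_domain}"


print(mask_email_domain("john.smith@bigbank.com"))  # prints john.smith@example.com
```

Only the part after the last `@` is changed, so the local part stays recognizable while the organization is hidden.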
- Nulling out: this technique replaces the column values with NULL in the database.
- Variance: this technique adds a random variance to the existing values. It is mostly used on numeric fields to produce variations of the same data, for example replacing each salary with a value between 80% and 120% of the current salary.
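The salary example can be sketched as below, assuming a uniform random factor between 0.80 and 1.20 and rounding to two decimal places. The function name `apply_variance` and the choice of a uniform distribution are assumptions; other distributions could be used.

```python
import random


def apply_variance(value, low=0.80, high=1.20, rng=random):
    """Scale a numeric value by a random factor drawn uniformly from [low, high]."""
    factor = rng.uniform(low, high)
    return round(value * factor, 2)


salaries = [50000, 72000, 110000]
print([apply_variance(s) for s in salaries])
```

Each masked salary stays in a realistic range around the original, so aggregate reports remain roughly meaningful while individual values are hidden.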
- Custom Rules / Expressions
- Certain fields are more complicated to mask than others, and custom rules or expressions may be needed to handle them. For example, a bank might require customer account numbers to follow the format BBB-LLLLLL-AAAA, where:
- B -> Bank Unique Code
- L -> Location / Branch code of the bank
- A -> Account number
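A custom rule for the account number format above could be sketched as follows. Several things here are assumptions for illustration only: that all three segments are numeric, that the bank and branch codes are kept as-is, and that only the account number segment is randomized. The names `ACCOUNT_PATTERN` and `mask_account_number` are hypothetical.

```python
import random
import re

# Assumed shape: 3-digit bank code, 6-digit branch code, 4-digit account number.
ACCOUNT_PATTERN = re.compile(r"(\d{3})-(\d{6})-(\d{4})")


def mask_account_number(account, rng=random):
    """Keep the bank and branch codes, replace the account segment with random digits."""
    match = ACCOUNT_PATTERN.fullmatch(account)
    if match is None:
        raise ValueError(f"Value does not match BBB-LLLLLL-AAAA: {account!r}")
    bank, branch, _acct = match.groups()
    new_acct = f"{rng.randint(0, 9999):04d}"
    return f"{bank}-{branch}-{new_acct}"


print(mask_account_number("123-456789-0001"))
```

The masked value still satisfies the BBB-LLLLLL-AAAA rule, so any validation that depends on the format keeps working.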
The technique of generating meaningful values for masked data is known as Intelligent masking. This technique is widely used in today's data masking solutions.
Hope this post was informative. Please feel free to comment. Thanks for the read.
About the Author
Rajaraman Raghuraman has nearly 8 years of experience in the Information Technology industry focusing on Product Development, R&D, Test Data Management and Automation Testing. He has architected a TDM product from scratch and currently leads the TDM Product Development team in Cognizant. He is passionate about Agile Methodologies and is a huge fan of Agile Development and Agile Testing. He blogs at Test Data Management Blog & Agile Blog. Connect with him on Google+