Wednesday, 27 February 2013

Commonly Used Data Masking Techniques - TDM

In my previous posts I discussed Data Subset and Data Masking.  In this post, I will discuss the data masking techniques that are widely used.  The list is by no means exhaustive, but it should give a general idea of the techniques that are available.

  • Random Substitution
    • In this technique, the value to be masked is replaced with a random value.  Depending on the nature of the random value, the technique can be further categorized into:
      • Random Numbers
      • Random Dates
      • Random Seed Values, for example:
        • Names
        • Addresses
        • SSNs
        • Credit Card Numbers
        • Telephone Numbers
        • And many more
      • Random Alphanumerics
  • Algorithmic Substitution
    • Even when random substitution is used, some fields must still satisfy a specific algorithm or format.  For example, a credit card number needs to pass the mod-10 (Luhn) check, and an SSN must be exactly 9 digits in the format AAA-GG-RRRR
      • A -> Area number within the US
      • G -> Group number
      • R -> Serial number (a random number when masking)
  • Sequence
    • This technique replaces the values with a generated sequence of data, for example consecutive numbers.
  • Selective Mask
    • This technique masks only a selected portion of the data.  For example, altering only the domain name of an email ID.
  • Nulling
    • This technique sets the column values to NULL in the database.
  • Blurring
    • This technique adds a random variance to the existing values.  It is mostly used on numeric fields to produce variations of the same data.  For example, replacing each salary with a value between 80% and 120% of the current salary.
  • Custom Rules / Expressions
    • Certain fields are more complicated to mask than others.  For those fields, custom rules / expressions might be needed.  For example, a bank might use the following rule for a customer account number: BBB-LLLLLL-AAAA
      • B -> Bank Unique Code
      • L -> Location / Branch code of the bank
      • A -> Account number
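
Three of the techniques above (algorithmic substitution, selective mask and blurring) can be sketched in a few lines of Python.  This is only an illustration, not how any particular TDM tool works: the function names, the 16-digit card length and the `example.com` replacement domain are assumptions made for the example.

```python
import random


def luhn_check_digit(partial: str) -> int:
    """Return the mod-10 (Luhn) check digit for a string of digits."""
    total = 0
    # Walk right to left; the rightmost digit of the partial number is doubled first.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def random_card_number(rng: random.Random, length: int = 16) -> str:
    """Algorithmic substitution: random digits plus a valid mod-10 check digit."""
    partial = "".join(str(rng.randrange(10)) for _ in range(length - 1))
    return partial + str(luhn_check_digit(partial))


def is_luhn_valid(number: str) -> bool:
    """Check a full number (including its check digit) against mod-10."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


def mask_email_domain(email: str, new_domain: str = "example.com") -> str:
    """Selective mask: keep the local part, replace only the domain."""
    local, _, _ = email.partition("@")
    return f"{local}@{new_domain}"


def blur(value: float, rng: random.Random, low: float = 0.8, high: float = 1.2) -> float:
    """Blurring: scale the value by a random factor between low and high."""
    return round(value * rng.uniform(low, high), 2)
```

Passing a `random.Random` instance into each function keeps the masking repeatable when a fixed seed is used, which tends to be convenient when the same test data set has to be regenerated.
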
The technique of generating meaningful values for masked data is known as Intelligent masking.  This technique is widely used in today's data masking solutions.
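
As a rough sketch of what generating meaningful values can look like, the Python snippet below replaces a name with one drawn from seed lists, produces a value in the AAA-GG-RRRR layout, and applies a custom rule to a BBB-LLLLLL-AAAA account number.  The seed lists, the record field names and the decision to keep the bank and branch codes unchanged are all assumptions made for the example, and the SSN-like value follows the format only, with no check of real issuance rules.

```python
import random

# Hypothetical seed lists; a real tool would ship far larger ones.
SEED_FIRST_NAMES = ["Alice", "Brian", "Carmen", "Deepak", "Elena"]
SEED_LAST_NAMES = ["Nguyen", "Okafor", "Patel", "Rossi", "Smith"]


def random_name(rng: random.Random) -> str:
    """Random substitution from seed values: pick a realistic-looking name."""
    return f"{rng.choice(SEED_FIRST_NAMES)} {rng.choice(SEED_LAST_NAMES)}"


def random_ssn_like(rng: random.Random) -> str:
    """Nine random digits laid out as AAA-GG-RRRR (format only)."""
    digits = "".join(str(rng.randrange(10)) for _ in range(9))
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def mask_account_number(account: str, rng: random.Random) -> str:
    """Custom rule for BBB-LLLLLL-AAAA: keep bank and branch, randomize the account part."""
    bank, branch, acct = account.split("-")
    new_acct = "".join(str(rng.randrange(10)) for _ in range(len(acct)))
    return f"{bank}-{branch}-{new_acct}"


def mask_record(record: dict, rng: random.Random) -> dict:
    """Return a copy of the record with the sensitive fields replaced."""
    masked = dict(record)
    masked["name"] = random_name(rng)
    masked["ssn"] = random_ssn_like(rng)
    masked["account"] = mask_account_number(record["account"], rng)
    return masked
```

The masked record still looks like real data (a plausible name, a correctly shaped SSN and account number), which is the point of this approach compared with simply scrambling characters.
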


Hope this post was informative.  Please feel free to comment.  Thanks for the read.





About the Author

Rajaraman Raghuraman has nearly 8 years of experience in the Information Technology industry, focusing on Product Development, R&D, Test Data Management and Automation Testing.  He has architected a TDM product from scratch and currently leads the TDM Product Development team at Cognizant.  He is passionate about Agile Methodologies and is a huge fan of Agile Development and Agile Testing.  He blogs at Test Data Management Blog & Agile Blog.  Connect with him on Google+

4 comments:

  1. Your posts are very good and concise!

  3. For a list of data masking functions built into a tool for it, see:
    http://www.iri.com/products/fieldshield/technical-details
    For TDM generally, under http://tdminsights.blogspot.com/, please note IRI RowGen (which produces referentially correct test data without having to mask production data), and planning a test data environment here: http://www.iri.com/blog/test-data/tdm-primer/

  4. Wonderful Blog!!! Your post is very informative about Data Management. Thank you for sharing the article with us.
