Hi,
I was just wondering why we still have data privacy issues all over big data, even though the data has been hidden and cannot be manipulated?
Thanks!!!
Hi Eeshwar!
Thanks for reaching out.
The main reason data privacy issues remain is that the masking technique sometimes does not work well enough, and hackers manage to obtain the real data, i.e. they manage to re-identify it. So, it’s basically a matter of how good a given masking technique is in a given situation, and whether it has worked well in that particular case.
This can also be classified as a data protection issue.
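To make re-identification concrete, here is a minimal sketch of one common way it can happen: the names are removed, but other columns (often called quasi-identifiers) stay intact and can be matched against a second data set that still has names. All records, column names and the `reidentify` helper below are made up for illustration; this is just one possible scenario, not a description of any specific incident.

```python
# Hypothetical "masked" release: names removed, other columns left intact.
masked_release = [
    {"zip": "10001", "birth_year": 1985, "gender": "F", "diagnosis": "asthma"},
    {"zip": "10002", "birth_year": 1990, "gender": "M", "diagnosis": "diabetes"},
]

# Hypothetical public list (invented data) that still carries names.
public_list = [
    {"name": "Jane Roe", "zip": "10001", "birth_year": 1985, "gender": "F"},
    {"name": "John Doe", "zip": "10002", "birth_year": 1990, "gender": "M"},
    {"name": "Sam Poe", "zip": "10003", "birth_year": 1971, "gender": "M"},
]

# Columns assumed to be shared by both data sets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def key(row):
    """Build a lookup key from the quasi-identifier columns."""
    return tuple(row[column] for column in QUASI_IDENTIFIERS)


def reidentify(masked_rows, public_rows):
    """Link masked rows to names when the quasi-identifiers match exactly one person."""
    names_by_key = {}
    for person in public_rows:
        names_by_key.setdefault(key(person), []).append(person["name"])

    matches = []
    for row in masked_rows:
        names = names_by_key.get(key(row), [])
        if len(names) == 1:  # a unique match means the row is re-identified
            matches.append((names[0], row["diagnosis"]))
    return matches


linked = reidentify(masked_release, public_list)
print(linked)
```

In this toy case both masked rows link back to a single name, which is why removing only the obvious identifiers may not be enough.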
Hope this helps.
Best,
Martin
Thanks, Martin, for clarifying my doubt to some extent. I am still wondering what kind of masking techniques could be used to protect the data. Could you please elaborate on this with a real-time example, if possible? I am curious to know more about this.
Hi Eeshwar!
There are many techniques, and data masking is a large field in its own right. Probably the most notable techniques are encryption (where authorised users need a key to access the data), substitution (where you mimic the look of the authentic data but actually provide inauthentic data that closely resembles the original), averaging (where you replace the individual values in a column with the column average), character scrambling (a basic technique that works well in some situations), shuffling (which you can learn more about in our program), and more.
When you say “real-time example”, I guess you are referring to Dynamic Data Masking (DDM), a relatively new technology that builds on “on-the-fly” data masking. The latter is an ETL (Extract, Transform, Load) process in which you specify the source of the information to be masked (environment 1) and the location where the masked data will be loaded (environment 2). DDM applies the same process, but to one record at a time rather than to an entire data set.
Hope this helps.
Best,
Martin
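For anyone who wants to see the techniques above in action, here is a minimal sketch in plain Python. It assumes a tiny in-memory table; the records, the `FAKE_NAMES` list and all function names are hypothetical and only meant to illustrate substitution, averaging, character scrambling and shuffling. The `mask_record` helper mimics the record-at-a-time idea behind DDM in a very simplified way and is not how any particular product implements it. Encryption is left out because it would need a proper cryptography library.

```python
import random
from statistics import mean

# Hypothetical sample records, invented for illustration.
records = [
    {"name": "Alice Smith", "salary": 52000, "phone": "555-0101", "city": "Leeds"},
    {"name": "Bob Jones", "salary": 61000, "phone": "555-0199", "city": "York"},
    {"name": "Carol White", "salary": 58000, "phone": "555-0142", "city": "Bath"},
]

# Stand-in values used for substitution (assumed, not real data).
FAKE_NAMES = ["Person A", "Person B", "Person C"]


def substitute_names(rows):
    """Substitution: replace each real name with an inauthentic stand-in."""
    return [
        {**row, "name": FAKE_NAMES[i % len(FAKE_NAMES)]}
        for i, row in enumerate(rows)
    ]


def average_column(rows, column):
    """Averaging: replace every value in a column with the column average."""
    column_average = mean(row[column] for row in rows)
    return [{**row, column: column_average} for row in rows]


def scramble_characters(value, rng):
    """Character scrambling: reorder the characters of a single value."""
    characters = list(value)
    rng.shuffle(characters)
    return "".join(characters)


def shuffle_column(rows, column, rng):
    """Shuffling: move the values of one column between rows."""
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


def mask_record(row, rng):
    """Mask one record at a time, loosely in the spirit of DDM."""
    return {**row, "phone": scramble_characters(row["phone"], rng)}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
masked = substitute_names(records)
masked = average_column(masked, "salary")
masked = shuffle_column(masked, "city", rng)
masked = [mask_record(row, rng) for row in masked]

for row in masked:
    print(row)
```

Note that every function returns new rows and leaves `records` untouched, which loosely mirrors the idea of reading from one environment and loading the masked result into another.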