Using anonymization and pseudonymization to reduce the overhead of GDPR compliance.
Just uttering the words General Data Protection Regulation (GDPR) is almost enough to make grown adults faint.
The GDPR is the updated and more stringent version of the European Union Data Protection Directive 95/46/EC, which goes back to 1995. Things have changed a lot since the nineties. For one thing, the Internet in 1995 was still a new and shiny thing that was sparsely used; I can’t even go back as far as 1995 on the Wayback Machine. Over the intervening years, how we create, access, and share data has changed radically. Cloud repositories, digital identity, IoT, and mobile computing have changed that landscape beyond recognition. The GDPR is a reaction to those changes.
The GDPR was adopted in April 2016 and becomes enforceable on 25 May 2018. The GDPR will mean that companies, not just in the EU member states but across the world, will have to look closely at their data security and privacy practices.
Feeling the fear of the GDPR
There is a lot of hype and fear about the GDPR. Going back to the nineties again, it reminds me of the FUD (Fear, Uncertainty, and Doubt) used by security vendors back then. Computer security, back in the day, was a hard sell. People didn't really care that much about it; it felt very geeky and like ‘somebody else’s problem’. So vendors created an atmosphere of FUD to sell security products. Some of the GDPR hype is FUD, but not all.
The GDPR has some very specific statements around security and privacy of data. I won't go into the details here, but give you a ‘feel’ for the expectations:
The territory it covers isn’t just EU states. If you process personal data belonging to people in the EU, you have to do so under GDPR rules, wherever your organization is based. This white paper by law firm Bird & Bird contains a lot of detail on the territorial scope of the GDPR.
The GDPR is about citizen/consumer (data subjects) rights in terms of their data. These rights underpin the ethos of the GDPR, for example, data subjects must be able to:
- Give explicit consent to share data
- Revoke consent to shared data
- Have the right to be forgotten (this is especially tricky when data controllers have to ensure this is reflected across search engines)
- View and change their collected data
- Request their personal data, and then share this with another controller
- Privacy by Design is expected: Privacy Impact Assessments (PIAs) are to be done at the early stages of a system design.
- Data breach notification has been tightened up. Notification is now mandatory and must happen within 72 hours of becoming aware of the breach.
- Fines have been increased. There are two levels of fines, depending on the infringement: up to 4% of global annual revenue or 20 million euros, whichever is higher; or up to 2% of global annual revenue or 10 million euros, whichever is higher.
Many organizations need a dedicated person, a Data Protection Officer (DPO), to oversee compliance with the GDPR.
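To give a feel for the "whichever is higher" rule in the two fine levels above, here is a minimal Python sketch. The function name `max_fine_eur` and the tier labels "higher" and "lower" are made up for illustration; the figures are the ones quoted in this article, and the result is only the upper limit of a fine, not a prediction of what a regulator would actually impose.

```python
from decimal import Decimal

# Caps as quoted in the article: (fixed amount in euros, share of global revenue).
# The tier labels are hypothetical names used only in this sketch.
FINE_TIERS = {
    "higher": (Decimal("20000000"), Decimal("0.04")),
    "lower": (Decimal("10000000"), Decimal("0.02")),
}


def max_fine_eur(global_revenue_eur, tier="higher"):
    """Return the upper limit of a fine: the greater of the fixed cap
    and the percentage of global revenue for the given tier."""
    fixed_cap, revenue_share = FINE_TIERS[tier]
    return max(fixed_cap, revenue_share * Decimal(global_revenue_eur))


if __name__ == "__main__":
    # A company with 1 billion euros in revenue: 4% exceeds the 20 million cap.
    print(max_fine_eur(1_000_000_000))
    # A company with 100 million euros in revenue: the fixed 10 million cap applies.
    print(max_fine_eur(100_000_000, tier="lower"))
```

The point of the sketch is that the fixed amount acts as a floor on the maximum, so smaller companies are not shielded by a small revenue figure.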
There are a lot of things to consider and actions to take to get in compliance with GDPR. However, there are also some ways to minimize the amount of work needed to comply.
Anonymization or Pseudonymization: who will win?
GDPR is all about the data: how you collect it, how you process it, and how you handle access management. The GDPR recognizes de-identification of data as a good way to help minimize data leakage and, in doing so, applies exemptions to de-identified data.
Within this technique are two approaches that allow you to take advantage of these exemptions and reduce your GDPR overhead:
- Anonymization: described in Recital 26 of the GDPR as data rendered anonymous in such a manner that the “data subject is not or no longer identifiable”. Anonymization is a high bar to achieve and requires specialist techniques. However, anonymization is becoming an essential part of health data stored within Cloud repositories and accessible to multiple teams as part of medical research.
- Pseudonymization: defined in Article 4 of the GDPR as the “processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately”. Pseudonymization removes the direct link between the data and the data subject, while whoever holds the separately stored additional information can still restore it.
The exemptions for de-identifying data offered by the GDPR either lighten the weight of the requirement (as in pseudonymization) or remove the need to comply with GDPR altogether (as in anonymization). Of course, the technical application of either method is a challenge in its own right.
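As an illustration of the pseudonymization idea described above, the following Python sketch replaces a direct identifier with a keyed hash (HMAC-SHA256). This is one common approach, not something the GDPR prescribes. The function names, the `email` field, and the `subject_ref` output field are assumptions made for this example. The secret key plays the role of the "additional information" and would have to be stored separately from the data set. Note that this is pseudonymization only: the remaining attributes may still identify someone, so the output should not be treated as anonymized.

```python
import hashlib
import hmac
import secrets


def new_pseudonym_key() -> bytes:
    """Generate a random secret key. Keep it apart from the pseudonymized data."""
    return secrets.token_bytes(32)


def pseudonymize(identifier: str, key: bytes) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256.

    Without the key, the pseudonym cannot feasibly be recomputed from a
    guessed identifier, which is why a keyed hash is used rather than a
    plain hash.
    """
    normalized = identifier.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_record(record: dict, key: bytes, id_field: str = "email") -> dict:
    """Return a copy of the record with the direct identifier replaced.

    The field names here are hypothetical; adapt them to your own schema.
    """
    out = {k: v for k, v in record.items() if k != id_field}
    out["subject_ref"] = pseudonymize(record[id_field], key)
    return out


if __name__ == "__main__":
    key = new_pseudonym_key()
    record = {"email": "Alice@Example.com", "country": "IE", "plan": "premium"}
    print(pseudonymize_record(record, key))
```

Because the same identifier and key always give the same pseudonym, records about one person can still be joined for analysis, while anyone without the key cannot easily attribute them to a named individual.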
Anonymization and identity
Anonymized or pseudonymized data is a double-edged sword and not always appropriate. However, we should strive to apply these techniques wherever possible. For example, in consumer identity, it is possible to check an identity against data sources to establish, to a high degree of probability, that the owner of the identity is who they say they are. Once you have established that identity to a high degree, I would argue that there is no real value in rechecking the data or even in holding detailed attribute information. This means that the user’s data could be securely deleted from the system.
For example, a service selling age-controlled products may need to know that the user is over 21. The service should not, however, be able to link this attribute back to an individual, but should still be able to grant access to the resource. If you need further information at a later date, for example, if the user moves to a new country, you can establish this change, apply privacy enhancement to the domicile attribute, and share it under anonymized, or at the least pseudonymized, conditions.
The above scenarios fall short for marketing purposes. Identity data is a commodity, and knowing who you are marketing to may well be worth paying the GDPR price to ensure privacy. Having the option to de-identify data is, however, important.
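The age check above can be sketched as a minimal "attribute release": the identity provider holds the date of birth, and the service only ever receives a yes/no answer. This is a simplified Python illustration under stated assumptions; the function names and the `over_21` claim key are hypothetical, and a real deployment would also need the claim to be signed and delivered over an authenticated channel.

```python
from datetime import date
from typing import Optional


def is_over(birth_date: date, years: int, today: date) -> bool:
    """Return True if the person has reached the given age on 'today'."""
    # Subtract one year if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    age = today.year - birth_date.year - (0 if had_birthday else 1)
    return age >= years


def age_claim(birth_date: date, threshold: int = 21,
              today: Optional[date] = None) -> dict:
    """Build the only data the service receives: a single boolean claim.

    The date of birth itself never leaves the identity provider.
    """
    today = today or date.today()
    return {f"over_{threshold}": is_over(birth_date, threshold, today)}


if __name__ == "__main__":
    # The service sees the boolean claim, not the underlying birth date.
    print(age_claim(date(1990, 3, 14), today=date(2018, 5, 25)))
```

The design choice is data minimization: the service can release access on the basis of the claim, but it holds nothing that links the claim back to a date of birth or a named person.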
Original Source: CSO Online