The EU General Data Protection Regulation (GDPR) 2016/679 goes into effect in May of 2018 and has the potential to be very disruptive to Information Storage and Management, in addition to many other aspects of business. Much has been written about the 99 articles in GDPR and how they must be addressed, but for Information Technology, implementing Article 17 may have the greatest impact.
Article 17 is the “right to erasure”, commonly referred to as the right to be forgotten. Briefly, this means that an individual can, at no cost to themselves, request that the data controller erase all personal data pertaining to them without undue delay. This means all data: all files, all personal data in a database, all replicated copies, all backup copies, and any copies that may have been moved into an archive must have the individual’s personal data erased. When an Information Technology professional understands that requirement, early retirement becomes the immediate consideration.
A few important points to consider:
The term data controller was mentioned earlier and needs to be understood, as does the term data processor. A data controller is “the individual or legal person who controls and is responsible for the keeping and use of personal information on computer or in structured manual files”. In practice, this responsibility falls to the Information Technology function in a company or organization. Data processors are the groups or organizations that “hold or process personal data, but do not exercise responsibility for or control over the personal data”. This corresponds to a cloud service where the processing is done and/or data is stored, or to an IT data center, whether internal or outsourced. The data controller is responsible for deleting the personal data and assuring it has been erased. The data processor is responsible for executing the operations but not for the decision process. The data processor cannot retain copies of the data or make them available for other uses.
Implications for IT
The task of tracking down all copies of data and erasing a specific individual’s personal data seems almost impossible. Consider the simpler case of personal data in a database. How many copies of that database exist and where are they? How many DBAs have made extra copies for testing and extra protection? This looks to be an intensive, time-consuming task. In addition, it is not a revenue-producing function.
No specific solution is in general use today. Using backup catalogs is one option, but it is incomplete because other copies can be made outside the visibility of the backup or copy data management solution.
An approach put forward by some application vendors that seems to be a practical solution is to encrypt each individual’s personal data with a person-specific encryption key. Only the application software would know which data is personal and would therefore control the encryption. This would be an effective means of erasure: destroying the individual’s encryption key renders that person’s data unreadable in every copy that has been made. It would eliminate the need to locate and process all copies (backup, replicated, privately held, etc.) to carry out the erasure.
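The per-person key idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's implementation: `PersonKeyStore`, `protect`, and `recover` are hypothetical names, and the cipher is a toy built from the standard library (a SHAKE-256 keystream plus an HMAC tag) standing in for a vetted authenticated cipher such as AES-GCM. The point it shows is that once the key is destroyed, every copy of the encrypted record becomes unreadable, wherever that copy lives.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream for illustration only; real systems should use a vetted cipher.
    return hashlib.shake_256(key + nonce).digest(length)


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and tag a record: output is nonce (16) + tag (32) + ciphertext."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Verify the tag and decrypt; raises ValueError if the key is wrong."""
    nonce, tag, body = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted data")
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class PersonKeyStore:
    """Hypothetical key store: one key per person, held apart from the data."""

    def __init__(self) -> None:
        self._keys: dict[str, bytes] = {}

    def key_for_write(self, person_id: str) -> bytes:
        # Create the person's key the first time their data is written.
        return self._keys.setdefault(person_id, secrets.token_bytes(32))

    def key_for_read(self, person_id: str) -> bytes:
        # Raises KeyError once the key has been destroyed.
        return self._keys[person_id]

    def destroy(self, person_id: str) -> None:
        # The erasure step: no key, no readable personal data in any copy.
        self._keys.pop(person_id, None)


def protect(store: PersonKeyStore, person_id: str, data: bytes) -> bytes:
    return seal(store.key_for_write(person_id), data)


def recover(store: PersonKeyStore, person_id: str, blob: bytes) -> bytes:
    return open_sealed(store.key_for_read(person_id), blob)
```

In this sketch the application would call `protect` before writing personal data anywhere, so backups, replicas, and private copies only ever contain the sealed form. An erasure request then reduces to `store.destroy(person_id)`. The assumption doing the real work is that the key store itself is never copied or backed up in a way that lets a destroyed key be restored.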
There are obvious problems with the approach. Application changes would be required, and there would be issues with data that is shared between applications or used for other purposes. Even so, these problems may be less severe than those of the alternatives. New processor encryption capabilities, including the IBM z14 with its new high-performance encryption and Intel Skylake x86 technologies, largely remove the performance impact on applications.
Encrypting data at the application level, where the content is understood, makes sense, but there are downstream effects. Data manipulation done in systems without content knowledge, such as compression and deduplication, would be significantly impaired if not eliminated. The gain in the amount of data stored in a given capacity that those systems achieve through data reduction would be lost, and more storage capacity would be required. Data reduction could still be accomplished, but it would have to move up to the application and be done before the encryption to have the same effect. Discovery of information about stored data would also have to work with or through the applications (via APIs) rather than trawling the data itself.
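The effect on data reduction is easy to see in a small experiment. The sketch below is an assumption-laden illustration: `toy_encrypt` is a hypothetical stand-in cipher (XOR with a SHAKE-256 keystream, not suitable for real use), zlib stands in for a storage system's compression, and a SHA-256 fingerprint stands in for a deduplication hash. It compares compressing after encryption with compressing before it, and checks whether identical records encrypted under two different personal keys still look identical to a deduplicating system.

```python
import hashlib
import secrets
import zlib


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    # Stand-in cipher for illustration: XOR with a SHAKE-256 keystream.
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


# A repetitive record, which is the kind of data that normally compresses well.
record = b"name=Jane Doe;city=Dublin;" * 200
key_a = secrets.token_bytes(32)
key_b = secrets.token_bytes(32)

# Downstream compression sees only ciphertext and finds nothing to reduce.
encrypt_then_compress = zlib.compress(toy_encrypt(key_a, record))

# Compression moved up into the application, ahead of encryption.
compress_then_encrypt = toy_encrypt(key_a, zlib.compress(record))

print("original bytes:       ", len(record))
print("encrypt then compress:", len(encrypt_then_compress))
print("compress then encrypt:", len(compress_then_encrypt))

# Deduplication matches content fingerprints; per-person keys hide the match.
fingerprint_a = hashlib.sha256(toy_encrypt(key_a, record)).hexdigest()
fingerprint_b = hashlib.sha256(toy_encrypt(key_b, record)).hexdigest()
print("duplicates detectable:", fingerprint_a == fingerprint_b)
```

Running this should show the encrypt-then-compress result at roughly the original size, while the compress-then-encrypt result is a small fraction of it. The exact numbers depend on the data and the compressor, so they are not quoted here.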
The magnitude of the problem of meeting the EU GDPR regulations overall is major, and Article 17, the right to erasure, is almost overwhelming. There may be some halfway approaches proposed or delivered, or some that only work in certain cases, but data controllers (IT personnel) should be cautioned against using an incomplete approach. Such an approach may ultimately be more costly to implement and may still result in extreme fines when its incomplete nature is exposed. The impending date is a hard date, with no staged introduction or warnings. A strategy needs to be developed now and implementation planned.