The Anonymisation Process in the MIDAS Project

Guest post by Roberto Bilbao from BIOEF

Processing and exploiting large databases can offer multiple benefits to society, provided that the rights of individuals, their privacy and the protection of their personal data are respected. In this context, the anonymisation of personal data takes on special value as a way to enable big data research without undermining data protection.

The purpose of the anonymisation process is to eliminate or minimise the risk of re-identification of the anonymised data while maintaining the veracity of the results. As a result, properly anonymised data fall outside the scope of data protection legislation. Several anonymisation techniques are available, such as noise addition, permutation, differential privacy, aggregation, k-anonymity, l-diversity and t-closeness. Each technique has its strengths and weaknesses.
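To make one of these techniques concrete, here is a minimal sketch of a k-anonymity check in Python. It is an illustration only, not the MIDAS implementation: the records, the choice of quasi-identifiers (age and postcode) and the helper names k_anonymity and generalise_age are all assumptions made for the example. A dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k records, and generalising a value (here, replacing an exact age with a ten-year band) is one common way to raise k.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same quasi-identifier values (the 'k' in k-anonymity)."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


def generalise_age(record, width=10):
    """Replace an exact age with a band such as '30-39'."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Hypothetical toy records, invented for this example (not MIDAS data).
records = [
    {"age": 34, "postcode": "48001", "diagnosis": "asthma"},
    {"age": 37, "postcode": "48001", "diagnosis": "diabetes"},
    {"age": 52, "postcode": "48002", "diagnosis": "asthma"},
    {"age": 58, "postcode": "48002", "diagnosis": "hypertension"},
]

quasi = ["age", "postcode"]  # assumed quasi-identifiers

k_before = k_anonymity(records, quasi)
generalised = [generalise_age(r) for r in records]
k_after = k_anonymity(generalised, quasi)

print(k_before, k_after)  # prints "1 2"
```

With exact ages every record is unique (k = 1); after banding the ages, each record shares its quasi-identifiers with one other record (k = 2). The price is a loss of detail, which illustrates the trade-off between re-identification risk and the usefulness of the data.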

In the MIDAS project, we plan to anonymise the health information of almost 900,000 people by breaking any chain of identification. This information will be cross-checked with other anonymised databases. In designing the anonymisation process, it will be necessary to foresee the consequences of a possible re-identification of persons, which could damage or diminish their rights. It will also be necessary to foresee a hypothetical loss of information caused by negligence of the personnel involved, the lack of an adequate anonymisation policy, a disclosure of the information, or the loss of the identification variables or identification keys of the users' data. As stated by the Article 29 Data Protection Working Party [1], anonymisation techniques can provide privacy guarantees and may be used to generate efficient anonymisation processes, but only if their application is engineered appropriately. This means that the context and the objectives of the anonymisation process must be clearly set out in order to achieve the targeted anonymisation while producing useful data. Therefore, we will pay special attention to:

  • patient re-identification and any privacy risks in the light of new technological developments;
  • the scientific utility of the published clinical data as a function of the anonymisation methodology used;
  • whether it is possible to successfully conduct a secondary analysis of the anonymised clinical data.

Moreover, the recommendations of national [2] and international [3] agencies regarding the anonymisation process will be considered.

References

  1. The Working Party on the Protection of Individuals with regard to the Processing of Personal Data. Opinion 05/2014 on Anonymisation Techniques. http://ec.europa.eu/newsroom/article29/news.cfm?item_type=1358
  2. Guidelines and guarantees for anonymisation processes. Spanish Agency for Data Protection. 2016. https://www.agpd.es/portalwebAGPD/canaldocumentacion/publicaciones/common/Guias/2016/Orientaciones_y_garantias_Anonimizacion.pdf
  3. European Medicines Agency Technical Anonymisation Group (TAG). 2017. http://www.ema.europa.eu/ema/index.jsp?curl=pages/regulation/general/general_content_001880.jsp