Where open research treats the publication of FAIR data as best practice, researchers working with data from human participants must balance transparency against privacy. This matters most when the data include sensitive information that falls under the General Data Protection Regulation (GDPR).
GDPR applies to the processing of personal data in the context of the activities of an establishment of a controller or a processor in the Union, regardless of whether the processing takes place in the Union or not.
For the purposes of the Regulation: “‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;”
(Source: Data Protection Commission)
First and foremost, researchers should adopt a robust data de-identification strategy to safeguard the privacy of the individuals involved in the study. This involves removing or altering identifying information, such as names or addresses, in the dataset before publication. Researchers should also carefully assess the re-identification risks associated with the shared data, considering both direct identifiers (e.g. name, address, date of birth) and indirect identifiers (e.g. job title, postcode, sexuality).
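As a rough illustration of the de-identification step described above, the following Python sketch drops direct identifiers and coarsens two indirect identifiers. The column names (`name`, `address`, `date_of_birth`, `postcode`, `age`) and the helper functions are hypothetical and chosen only for this example; a real project would derive them from its own data dictionary and risk assessment. Removing and coarsening fields in this way reduces risk, but it does not by itself make a dataset anonymous.

```python
from typing import Dict

# Hypothetical column names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "address", "date_of_birth"}


def generalise_postcode(postcode: str) -> str:
    """Keep only the first block of a postcode (e.g. 'D07' from 'D07 H6K8')."""
    return postcode.split()[0] if postcode.strip() else ""


def age_band(age: int, width: int = 10) -> str:
    """Replace an exact age with a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def deidentify(record: Dict[str, object]) -> Dict[str, object]:
    """Drop direct identifiers and coarsen selected indirect identifiers."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "postcode" in cleaned:
        cleaned["postcode"] = generalise_postcode(str(cleaned["postcode"]))
    if "age" in cleaned:
        cleaned["age"] = age_band(int(cleaned["age"]))
    return cleaned
```

Calling `deidentify` on each record before export keeps the analytical fields while removing the most obvious identifiers. Which indirect identifiers to coarsen, and by how much, remains a judgement for the research team.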
Striking a balance between making valuable data accessible to the scientific community and protecting the privacy rights of participants is crucial for maintaining ethical standards in an open research framework.
The following principles for data processing must be observed (Article 5 GDPR):
Lawfulness, fairness and transparency
Purpose limitation
Data minimisation
Accuracy
Storage limitation
Integrity and confidentiality
The Health Research Data Protection Network (HRDPN) has developed a Practical Guide on Data Protection for Health Researchers.
For more information, please contact the TU Dublin Data Protection Office.
Informed Consent is a voluntary and well-informed formal agreement obtained from individuals before their participation in a study. It involves providing participants with clear and understandable information about the research purpose, procedures, potential risks, and benefits. Crucially, informed consent ensures that participants are aware of their rights, including the right to withdraw from the study at any time without facing negative consequences.
Researchers need to implement clear and transparent data sharing policies that explicitly communicate the purpose and scope of data usage to participants.
Informed consent becomes a critical aspect in this context, requiring researchers to obtain explicit and specific consent from participants for sharing their data openly. This involves providing participants with comprehensive information about how their data will be used, who will have access to it, and the potential risks involved.
The DARIAH ELDAH Consent Form Wizard can help you develop a compliant participant consent form. (Note that this service is primarily aimed at humanities research.)
Researchers must carry out a Data Protection Impact Assessment (DPIA) when a planned project involves the processing of personal data and the nature, scope, context, and purposes of that processing are likely to result in a high risk to the rights and freedoms of individuals (Article 35 GDPR).
For more information, please contact the TU Dublin Data Protection Office.
“‘Pseudonymisation’ of data means replacing any identifying characteristics of data with a pseudonym, or, in other words, a value which does not allow the data subject to be directly identified.
The GDPR and the Data Protection Act 2018 define pseudonymisation as the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that (a) such additional information is kept separately, and (b) it is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable individual.” (Source: Data Protection Commission)
“‘Anonymisation’ of data means processing it with the aim of irreversibly preventing the identification of the individual to whom it relates. Data can be considered effectively and sufficiently anonymised if it does not relate to an identified or identifiable natural person or where it has been rendered anonymous in such a manner that the data subject is not or no longer identifiable.” (Source: Data Protection Commission)
Below are some useful resources and guides to assist with this process:
McGill University Data Anonymisation Workshops (video content)
Amnesia (data anonymisation tool)
ARX (data anonymisation tool)
In short, pseudonymised data is re-identifiable, and anonymised data is not.
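To make the distinction concrete, here is a minimal Python sketch of one possible pseudonymisation approach: replacing an identifier with a keyed hash (HMAC-SHA256 from the standard library). The function names and the 16-character pseudonym length are assumptions made for this example, not a method prescribed by the GDPR or the Data Protection Commission. The secret key plays the role of the "additional information" in the definition above and would need to be stored separately from the dataset under appropriate access controls.

```python
import hashlib
import hmac
import secrets


def new_secret_key() -> bytes:
    """Generate a random key; store it separately from the research dataset."""
    return secrets.token_bytes(32)


def pseudonymise(identifier: str, key: bytes) -> str:
    """Derive a stable pseudonym from an identifier using a keyed hash."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    # Truncated for readability; this length is an illustrative choice.
    return digest.hexdigest()[:16]
```

The same identifier and key always give the same pseudonym, so records can still be linked within the study. Anyone holding the key can recompute the pseudonym for a known identifier, which is why data treated this way remains re-identifiable and is still personal data.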
Researchers should also establish a robust data governance framework, documented in a Data Management Plan, that outlines how data will be stored, accessed, and managed throughout its lifecycle in compliance with GDPR. By fostering a culture of responsible data sharing and prioritising privacy considerations, researchers can contribute to the advancement of science while respecting the rights and confidentiality of those who participate in their studies.
In addition to complying with GDPR, researchers at TU Dublin must also comply with relevant institutional policies, e.g. the TU Dublin Data Protection Policy and the TU Dublin Open Access to Publications Policy.
The Open Research Support Unit provides training on GDPR & Open Research. Please contact us.
“To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly.
To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.” (Source: General Data Protection Regulation, Recital 26)
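One way to get a first impression of the "singling out" risk mentioned above is to count how many records share each combination of indirect identifiers. The Python sketch below does this in the style of a k-anonymity check. Note that k-anonymity is a heuristic brought in here for illustration; it is not named in the GDPR, and the function and column names are hypothetical. A dataset that passes such a check is not thereby proven anonymous.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def _group_counts(
    records: List[Dict[str, object]], quasi_identifiers: Sequence[str]
) -> Counter:
    """Count records per combination of quasi-identifier values (values must be hashable)."""
    return Counter(
        tuple(record.get(col) for col in quasi_identifiers) for record in records
    )


def smallest_group_size(
    records: List[Dict[str, object]], quasi_identifiers: Sequence[str]
) -> int:
    """Return the size of the smallest group; 1 means someone can be singled out."""
    if not records:
        return 0
    return min(_group_counts(records, quasi_identifiers).values())


def unique_combinations(
    records: List[Dict[str, object]], quasi_identifiers: Sequence[str]
) -> List[Tuple[object, ...]]:
    """List the quasi-identifier combinations that occur exactly once."""
    counts = _group_counts(records, quasi_identifiers)
    return [combo for combo, n in counts.items() if n == 1]
```

If `smallest_group_size` returns 1 for a chosen set of columns, at least one participant is unique on those attributes. Further generalisation or suppression may then be worth considering before the data are shared.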
This work is licensed under CC BY-NC-SA 4.0