When can I share my data and with whom?
Whether you can share your research data with others depends on:
1. The anonymity of your data
2. Who owns your data
3. The infrastructure available to share the data
This chapter covers the first point and discusses the EU privacy law: the General Data Protection Regulation (GDPR).
The GDPR
Since May 2018, the General Data Protection Regulation (Dutch: Algemene Verordening Gegevensbescherming [AVG]) has been in place to better protect personal data. The most important aspects of the GDPR are:
- Privacy by Design: build privacy-increasing measures into your study design
- Privacy by Default: make sure your default settings already improve your participants' privacy
- Data minimization: Only collect and use personal data necessary for your research goal
- Legal basis: Make sure there is a legal basis (6 possible) to process (and share) the personal data you collect (e.g., informed consent or public interest; more info (Dutch))
- DPIA: Conduct a Data Protection Impact Assessment whenever you collect (highly) sensitive data, such as names, addresses, race or health data.
- Inform participants about the goal of the personal data collection and which data you collect.
What is personal data?
Data is personal when you can identify someone by it, either directly (e.g., name, address) or indirectly (e.g., height, job, income, education). Indirect indicators are personal data if they can identify someone:
- when it concerns an extreme case (e.g., someone 2.20m tall)
- when a combination of data can apply to only one person (NB: this also holds when the data are publicly available)
- when re-identification is still possible (e.g., with a name-number key conversion file)
By law, data is considered identifiable when identification can occur with reasonable (proportionate) effort. A merely hypothetical possibility that data could be linked or combined is therefore not enough. Because not everyone has access to the same data, the definition of "identifiable" may differ per situation.
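As a rough illustration of indirect identification, the sketch below flags records whose combination of indirect identifiers is unique within a dataset. The function name `risky_records`, the field names, the threshold `k` and the example data are all assumptions made for this sketch. It only looks inside one dataset, so it cannot show that data are anonymous in the legal sense: linking with other (public) sources is not covered.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=2):
    """Return records whose combination of indirect identifiers
    occurs fewer than k times in the dataset (hypothetical helper)."""
    def combo(record):
        return tuple(record[field] for field in quasi_identifiers)

    counts = Counter(combo(r) for r in records)
    return [r for r in records if counts[combo(r)] < k]


# Made-up example data, for illustration only.
participants = [
    {"id": "P01", "height_cm": 180, "job": "teacher"},
    {"id": "P02", "height_cm": 180, "job": "teacher"},
    {"id": "P03", "height_cm": 220, "job": "teacher"},  # extreme case
]

flagged = risky_records(participants, ["height_cm", "job"])
print([r["id"] for r in flagged])  # prints ['P03']
```

Records that are flagged this way are candidates for removal or coarsening (e.g., replacing exact height by a range) before sharing.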
Important types of data
- Pseudonymous data: Data that are identifiable only with a key that still exists. This is the case when, after encryption, it is still possible to identify someone, e.g., because the key or the source data still exist. Because the encryption is reversible, pseudonymous data are still considered personal data and require a legal basis for processing.
- Special personal data: special, sensitive categories of personal data that may be difficult to anonymize and that require additional measures:
  - race or ethnic descent
  - political views
  - religion
  - union membership
  - genetic or biometric data aimed at unique identification
  - health data
  - sexual life and preference
  - criminal records
  - in the Netherlands: burgerservicenummer (BSN)
- Anonymous data: Data that are not (re)identifiable anymore: neither by a name-number key, nor by combining with other publicly available data. Anonymous data are not considered personal data, so processing and sharing this kind of data do not require a legal basis.
Sharing data under the GDPR
Anonymous data can be shared without restriction, provided they are truly anonymous. You may share non-anonymous data only when:
- You have obtained explicit informed consent from the participant to do so (most used legal basis). For special personal data, this consent should be very explicit ("I agree to share x, y and z with A, B and C"): there cannot be any doubt about this. See some example sentences and a GDPR version of the Open Brain Consent initiative.
- You reduce the amount of personal data shared to a minimum (data minimization principle)
- You take the necessary measures to protect your participants' privacy
- You have written a Data Management Plan (DMP) and a Data Protection Impact Assessment (DPIA); always do this before starting a project with personal data
- When sharing data with researchers outside the EU, Norway, Liechtenstein and Iceland (i.e., where the GDPR does not apply), make sure that the recipient's country has an adequacy decision from the European Commission. If the country does not have one, you need to take extra protection measures, such as standard contractual clauses or agreements.
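The data minimization condition in the list above can be made concrete with a small sketch: before sharing, keep only the fields that the recipient needs for the stated research goal. The helper `minimize`, the field names and the example records are hypothetical and only serve as illustration. Dropping fields does not by itself make data anonymous.

```python
def minimize(records, fields_to_share):
    """Return copies of the records that contain only the listed fields
    (hypothetical helper for this sketch)."""
    return [{field: r[field] for field in fields_to_share} for r in records]


# Made-up collected data, for illustration only.
collected = [
    {"name": "A. Example", "address": "Example Street 1",
     "age_group": "30-39", "score": 12},
    {"name": "B. Example", "address": "Example Street 2",
     "age_group": "20-29", "score": 9},
]

# Share only what is needed for the analysis; names and addresses stay behind.
shared = minimize(collected, ["age_group", "score"])
print(shared)
```

Listing the fields to share explicitly, rather than listing the fields to remove, means that a column added later is not shared by accident.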
If your data are not anonymous, but you have obtained consent and still want to protect your participants' privacy better, you may always use a data sharing agreement. This document specifies what users can and cannot do with your data, for how long, and whether you will get credit if the user publishes about your data. A good example is the agreement used by the Donders repository. The Open Brain Consent initiative is also working on a template agreement, or find an example template in the template chapter.
Anonymizing data
General tips
- Remove identifiers (name, address)
- Replace identifiers (e.g., date of birth by age or age groups)
- Use pseudonyms (e.g., participant numbers)
- Randomize the pseudonyms (participant numbers)
- Use only the middle range of the data: extreme cases may lead to identification because, by definition, there are only a few of them
- Remove the name-participant number key
- Plan how to anonymize the data up front and keep a log of your procedures
- Store original data in a safe location
- Determine whether different measures combined could lead to identification. If needed, consult a privacy officer.
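Several of the tips above (removing identifiers, replacing date of birth by age groups, using randomized participant numbers, keeping the name-number key separate) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the field names, the `sub-` prefix and the 10-year age bands are all choices made for this example, not a prescribed procedure. As long as the returned key exists, the result is pseudonymous, not anonymous.

```python
import random
from datetime import date


def age_group(date_of_birth, reference_date, width=10):
    """Replace a date of birth by a coarse age band such as '30-39'."""
    had_birthday = (reference_date.month, reference_date.day) >= (
        date_of_birth.month, date_of_birth.day)
    age = reference_date.year - date_of_birth.year - (0 if had_birthday else 1)
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def pseudonymize(records, reference_date, rng=None):
    """Return (deidentified_records, key) with randomly assigned numbers."""
    rng = rng or random.Random()
    numbers = list(range(1, len(records) + 1))
    rng.shuffle(numbers)  # numbers no longer reflect the order of inclusion
    deidentified, key = [], {}
    for record, number in zip(records, numbers):
        code = f"sub-{number:03d}"
        key[code] = record["name"]  # name-number key: store separately or delete
        deidentified.append({
            "participant": code,
            "age_group": age_group(record["date_of_birth"], reference_date),
            "score": record["score"],
        })
    # Sort by code so the row order does not reveal the original order either.
    deidentified.sort(key=lambda r: r["participant"])
    return deidentified, key


# Made-up example data, for illustration only.
raw = [
    {"name": "A. Example", "date_of_birth": date(1990, 5, 17), "score": 12},
    {"name": "B. Example", "date_of_birth": date(1985, 11, 2), "score": 9},
]
data, key = pseudonymize(raw, reference_date=date(2024, 1, 1))
print(data)
```

Whether the output is safe to share still depends on the remaining fields: a rare combination of age group and score could identify someone.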
Deidentifying MRI-data
There is some debate as to whether or not MRI data can be anonymized. One paper, for example, found that brain morphology, although preprocessed, was personally identifiable (Takao, Hayashi, & Ohtomo, 2015). Moreover, combining multiple databases may arguably make the data identifiable as well. Therefore, we do not speak of anonymizing MRI-data, but of deidentifying it: MRI-data will always remain pseudonymous at best and therefore require a legal basis before sharing.
- Anonymize the filenames: replace names with codes
- Remove the header information (applies to hdr and img files, not to NIfTI files)
- Deface the MRI-scans if your software does not do that automatically already. We recommend using pydeface.
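The first step in the list above, replacing names in filenames with codes, could look like the sketch below. The function `coded_filenames`, the file names and the name-to-code mapping are hypothetical. The sketch only builds a rename plan and does not touch any files, and it does not remove header information or deface the scans: those remain separate steps (e.g., with a tool such as pydeface).

```python
from pathlib import Path


def coded_filenames(paths, name_to_code):
    """Map each scan path to a new path in which the participant name
    in the filename is replaced by a code (hypothetical helper)."""
    plan = {}
    for path in map(Path, paths):
        for name, code in name_to_code.items():
            if name in path.name:
                plan[path] = path.with_name(path.name.replace(name, code))
                break
        else:
            # Fail loudly rather than leave a name in a shared filename.
            raise ValueError(f"No code known for {path.name}")
    return plan


# Made-up filenames and mapping, for illustration only.
plan = coded_filenames(
    ["scans/doe_T1w.nii.gz", "scans/smith_T1w.nii.gz"],
    {"doe": "sub-001", "smith": "sub-002"},
)
for old, new in plan.items():
    print(old, "->", new)  # call old.rename(new) to actually rename the files
```

Reviewing the plan before renaming makes it easier to keep a log of the procedure. The name-to-code mapping is itself a key file and should be stored separately from the shared data.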
If you are uncertain whether your data are anonymous, please don't hesitate to contact a privacy officer.
Have a look at this MRI data sharing guide for more info!
GDPR resources
- Open Brain Consent initiative, a bottom-up initiative to make sense of the GDPR in sharing MRI data
- A great overview of the GDPR and its practical implications (by Enrico Glerean, 2020)
- Course about privacy in research
- Privacy dos and don'ts
- Guide for sensitive data
- UU guides for handling personal data and informed consent
- Legal instruments protecting data (agreements)
Erasmus University contacts
- Privacy office ESSB: privacy [at] essb [dot] eur [dot] nl, or see this page
- Legal counsel: see this page
- Research support, e.g., data stewards: see this page
- IT-related questions: it [dot] servicedesk [at] eur [dot] nl
See all support staff here