
Can anonymous and pseudonymised data be considered sensitive and identifying under GDPR?

March 23, 2018

(This post fixates on a teeny technicality and doesn’t really have a conclusion, so proceed at your peril…)

As I said in my last post about GDPR: “GDPR assumes that I want to identify people from the information I hold.  I don’t.  I don’t need to and I actively try not to.  But it doesn’t seem to acknowledge that as a possible starting point, which makes some of the more complex elements either irrelevant or confusing.”

One of the key principles of GDPR is “Data minimisation – Personal data processed is adequate, relevant and limited to what is necessary.”  This means that I should go out of my way to ensure that I do not collect and store more data than I need to.  That’s OK by me, I have routinely been minimising data throughout my career, as the Market Research Code of Conduct requires that: “33. Members must take reasonable steps to ensure all of the following: […]  f. that personal data collected are relevant and not excessive.”

When it comes to GDPR, one way of minimising data is to remove the identifying elements.  In the eyes of GDPR it isn’t ‘personal data’ if it isn’t personally identifying.  Again, fine by me.  I have routinely been anonymising data or collecting anonymous data throughout my career, as the Market Research Code of Conduct requires that: “26. Members must ensure that the anonymity of participants is preserved unless participants have given their informed consent for their details to be revealed or for attributable comments to be passed on.”

GDPR is really keen on anonymising previously identifying data.  In fact, it terms this ‘pseudonymisation’.  It says: “The application of pseudonymisation to personal data can reduce the risks to the data subjects concerned and help controllers and processors to meet their data-protection obligations.”  And even better, the MRS says: “Once you are working with anonymised data (i.e. participant responses which cannot identify an individual), the requirements of the data protection rules are no longer applicable.”

Basically, if I go out of my way to obtain, process and store my data so that individuals cannot be identified (which I do routinely already) my obligations under GDPR are minimal. 

Sounds good.

Thing is though, it looks like there are circumstances where my obligations should be minimal because my data is pseudonymised, but they are concurrently maximal because the content contains ‘special categories’ which ‘merit higher protection’.

Special categories include “racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person’s sex life or sexual orientation.”

Here’s what I’m thinking.

Imagine I am undertaking depth interviews, which I audio record for analysis purposes.  With the greatest respect to the respondents, their identity almost never matters to me beyond being able to call the right phone number and ask for the right person.  Although I have the names of my respondents in an Excel file for the purposes of making appointments, this is not linked to the audio files in any way.  I do not mention the respondent’s name at any point during the recording.  The audio files are saved as randomly assigned ‘Interview 1’, ‘Interview 2’, ‘Interview 3’ etc. rather than by name, as are any notes that I make or quotes that I use.  Verbatim quotes are reported thematically (out of order and mixed with others) and unattributed.  I keep the audio recording and notes and final report on file.  GDPR considers these files pseudonymised data.  The data was always going to be pseudonymised, and I did this thoroughly and at the earliest point.
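For what it's worth, the random labelling step could be sketched in Python roughly like this. It is only a minimal sketch under assumptions: the function name `assign_interview_labels` and the file names are hypothetical, and it is not a description of any particular tool. It shuffles the recordings and hands back the new labels, with no respondent names involved at any point.

```python
import random


def assign_interview_labels(recordings, rng=None):
    """Pair each recording with a randomly assigned 'Interview N' label.

    `recordings` is any list of recording identifiers (hypothetical file
    names here). The order is shuffled so that the label number says
    nothing about the order in which the interviews took place.
    """
    rng = rng or random.Random()
    shuffled = list(recordings)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return [
        (f"Interview {number}", recording)
        for number, recording in enumerate(shuffled, start=1)
    ]


if __name__ == "__main__":
    # Hypothetical file names, used purely for illustration.
    for label, recording in assign_interview_labels(
        ["rec_a.wav", "rec_b.wav", "rec_c.wav"]
    ):
        print(label, "<-", recording)
```

The appointments spreadsheet is never read by this step, which is the point: the labels cannot be traced back to names unless someone separately records which label went to which person.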

Great.  This is all good.  I did it this way anyway.  But, it appears that there are circumstances in which this can all be the case but the file is still considered ‘personal data’.

  • The files may be considered ‘personal data’ if I need to ask questions which constitute ‘sensitive personal data’ about the respondent’s racial or ethnic origin, political opinions, religious beliefs, trade union activities, physical or mental health, sexual life, or details of criminal offences. This is almost certain to happen during interviews about charity work.  Many charity projects focus on tackling one or more of these issues.  It would be an unusual interview about charity work that did not touch on wellbeing.
  • The files may be considered ‘personal data’ if the respondent randomly brings up something which constitutes ‘sensitive personal data’ relating to their racial or ethnic origin, political opinions, religious beliefs, trade union activities, physical or mental health, sexual life, or details of criminal offences. People go off on tangents, and especially in a one-to-one situation they tell you stuff about themselves.  You could be asking them about toothpaste and they’ll tell you they are Christian or they have been in prison or they suffer from fibromyalgia or they voted for Nigel Farage.  Oops, suddenly it’s ‘sensitive’.

Similarly, imagine I am undertaking a web survey about taking part in a charity project, or handing out feedback forms after a session.  With the greatest respect to the respondents, their identity is not important to me so I do not ask them for their name or contact details at any point – in fact I have never known them.  I keep an Excel file of responses on file and any paper copies in a secure archive box.  Reporting is conducted on aggregated data.  GDPR considers this anonymous data.

Again this is all good.  I did it this way anyway.  But, again it appears that there are circumstances in which this can all be the case but the file is still considered ‘personal data’.

  • The filled-in questionnaires may be considered to contain ‘personal data’ if I need to ask questions which constitute ‘sensitive personal data’ about the respondent’s racial or ethnic origin, political opinions, religious beliefs, trade union activities, physical or mental health, sexual life, or details of criminal offences. Again, this is almost certain to happen during surveys about charity work.

So I guess what I’m saying is that on my first readings it looks like I routinely hold a bunch of anonymous and pseudonymised data which the GDPR considers sensitive and identifying, even though it cannot be linked to a named individual.  Which is weird.

If this is indeed the case I’m not quite sure how to treat that data, either at the point of collection, processing or storage.  Do I need to do nothing, because it is no longer (and was never going to be) ‘data’, or do I need to collect a bunch of extra permissions and take extra precautions because of the special categories?

I’m confused at this point… to be honest it all appears to be pretty ‘low risk’ from the perspective of the data subject so I’m not hugely worried… but I am confused.

I don’t think GDPR expects me to be routinely pseudonymising everything already so it doesn’t really account for this.

I have signed up for a GDPR Q&A webinar with my professional body.  Maybe they will be able to clear this one up.
