How to clean your own Surveymonkey data

October 31, 2017

Second in an unexpected mini-series on the professional use of Surveymonkey, I’m going to get even more geeky… brace yourself…

Quantitative data is all about ‘facts’ and ‘hard stats’, but in order to present a set of objective data you sometimes need to use your subjective judgment to ensure that your dataset is as strong as possible.  But sometimes – by no fault of your own – your data isn’t strong because crap gets in there.  Respondents fill your survey in wrong, or give up half way through, or (shudder) lie, and all that crap makes your dataset weak.  So if you are analysing data from a survey, you want to make sure there is no crap in there cluttering your data up and skewing it unintentionally.

When I worked for global research agencies we had coders and analysts and systems in place for ‘cleaning’ the data and kicking out the dross, and the data would all be pretty much perfect before it came back to us Consultants to consider and report on.

As a sole trader I don’t have this resource, so it is up to me to make sure my data is as strong and useful as possible.  I do this by identifying weak/dodgy elements within the data and either correcting them or removing them from my dataset before I start my analysis and reporting.

I thought you might be interested to know what I do to ‘clean’ the data I collect from a typical Surveymonkey web survey.  You might like to be assured as a client that I am working with high quality data, or you might like to use my methods to clean your own data.

Sorry, it gets a bit technical!  Read on if you dare.

TIP 1: Delete those that have been screened out

Example: Say Q1 was “Did you take part in MyCharityEvent?” and you don’t need those that said “no”.

If you put in a screener question at the start of your questionnaire to make sure that you included the right people in your survey, you can now delete those that should have been screened out.  Surveymonkey includes them in your total responses and analysis graphs and so on, but you don’t need them for analysis purposes.

HOW TO: Go into the ‘Analyse results’ section, filter by ‘Q1 answer no’, click ‘individual responses’ and then delete each one.
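
If you would rather do this on an exported copy of your data than click through each response, here is a minimal sketch in Python. It assumes you have exported your responses and loaded them as one dictionary per respondent (for example with `csv.DictReader`). The column names `respondent_id` and `q1_took_part` are made up for illustration, so swap in whatever your own export calls them.

```python
# Hypothetical export rows: one dict per respondent, keyed by column name.
responses = [
    {"respondent_id": "1", "q1_took_part": "Yes"},
    {"respondent_id": "2", "q1_took_part": "No"},
    {"respondent_id": "3", "q1_took_part": "Yes"},
]


def drop_screened_out(rows, screener_column, disqualifying_answer):
    """Return only the rows whose screener answer is not the disqualifying one."""
    unwanted = disqualifying_answer.strip().lower()
    return [
        row
        for row in rows
        if (row.get(screener_column) or "").strip().lower() != unwanted
    ]


kept = drop_screened_out(responses, "q1_took_part", "No")
print([row["respondent_id"] for row in kept])
```

This builds a new list rather than deleting anything, so the original rows are still there if you change your mind.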

TIP 2: Evaluate your partial responses

Example: Say a respondent filled in the first six questions of a twenty question survey, then gave up.

Sometimes people only fill in half of a survey before getting bored and giving up.  It is up to you whether you include these partial responses in your analysis, and that may depend on whether you got enough responses in total to afford losing the partials.  But the data is stronger if it is based on a complete response set, and if you put your demographic section at the end (which I think you should) your partials probably skipped this bit, which might limit their cross-analysis potential.  I’d take these on a case-by-case basis, looking through each partial response to check whether you consider it ‘complete enough’ to be useful.  If it is ‘complete enough’ there are things you can do to increase the usefulness.  If it does not look useful (for example if they only answered two out of ten questions) you might want to delete it.

HOW TO: Go into the ‘Analyse results’ section, filter by ‘completeness’ and ‘partial responses’.  Click ‘Show’ and select three survey pages: one near the start, one near the middle and one near the end of the survey.  Now go to ‘individual responses’ and scroll through these.  If all three contain answers (i.e. they do not say ‘Respondent skipped this question’), this indicates a useful response because it means they have filled in most of the survey.  If they have filled in most of the survey, click ‘Edit’ and have a look at the detail.  Maybe they have simply failed to click ‘done’ at the end, in which case you can do this for them.  Maybe they have only skipped a couple of answers at the end, and if you back-fill these with Prefer not to say / not applicable / Don’t know responses then you can complete the survey.  When you’ve done that, see what remains and consider deleting the genuinely partial responses if they do not appear useful for analysis.
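
If you are working from an exported file, a rough way to sort the partials is to count how much of the survey each person answered. This is a sketch of a slightly different technique from the three-page spot check above: it works out the fraction of questions answered and sets aside anything below a cut-off for you to look at by hand. The column names `q1` to `q10` and the 0.7 cut-off are my own assumptions, not anything Surveymonkey gives you.

```python
# Hypothetical column names for a ten-question survey: q1 .. q10.
QUESTIONS = [f"q{n}" for n in range(1, 11)]


def make_row(respondent_id, answered):
    """Build a demo row where only the first `answered` questions are filled in."""
    row = {"respondent_id": respondent_id}
    for index, column in enumerate(QUESTIONS):
        row[column] = "some answer" if index < answered else ""
    return row


rows = [make_row("A", 10), make_row("B", 8), make_row("C", 2)]


def completeness(row, question_columns):
    """Fraction of the given question columns that hold a non-blank answer."""
    answered = sum(
        1 for column in question_columns if (row.get(column) or "").strip()
    )
    return answered / len(question_columns)


def triage(rows, question_columns, threshold=0.7):
    """Split rows into 'complete enough' and 'needs a manual look'."""
    complete_enough, needs_review = [], []
    for row in rows:
        if completeness(row, question_columns) >= threshold:
            complete_enough.append(row)
        else:
            needs_review.append(row)
    return complete_enough, needs_review


complete_enough, needs_review = triage(rows, QUESTIONS)
print([row["respondent_id"] for row in complete_enough])
print([row["respondent_id"] for row in needs_review])
```

The cut-off only decides which responses you read first. Whether a response is ‘complete enough’ is still your call.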

TIP 3: Back-code your free text

Example: Say you asked “Where did you find out about MyCharityEvent?” and the respondent wrote in “Facebook” when really you intended them to tick “Social media” from your list of response options.

Sometimes people write things into your ‘Other please specify’ boxes that really should have been ticked from your response options.  You can ‘back code’ these answers by going back in and ticking the appropriate box, so that you can include them in the analysis.

HOW TO: First, have a look at the list of text responses by going to the ‘Analyse results’ section, scrolling to the appropriate question and clicking on the ‘responses’ link at the bottom of your list of response options.  Click on ‘view respondent’s answers’ next to the one you want to change (i.e. in this case Facebook), click ‘edit’ and then flick through the survey until you get to the right question.  Tick the box you want to add (i.e. in this case Social media), and click ‘next’ for the next page.  This will now be saved.
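
If you have lots of these, the same back-coding can be sketched on an exported file with a small lookup table. This is only an illustration: the column names `heard_via` and `heard_via_other`, the ‘Other’ label and the contents of the lookup are all assumptions, and you would build your own lookup after reading through the real free text.

```python
# Hypothetical lookup: lower-cased free text -> the option it should have been.
BACK_CODES = {"facebook": "Social media"}

# Hypothetical export rows; "heard_via_other" holds the 'Other please specify' text.
rows = [
    {"respondent_id": "1", "heard_via": "Other", "heard_via_other": "Facebook"},
    {"respondent_id": "2", "heard_via": "Other", "heard_via_other": "My neighbour told me"},
    {"respondent_id": "3", "heard_via": "Poster", "heard_via_other": ""},
]


def back_code(row, choice_column, other_column, mapping, other_label="Other"):
    """Return a copy of the row with 'Other' recoded when the free text matches."""
    text = (row.get(other_column) or "").strip().lower()
    if row.get(choice_column) == other_label and text in mapping:
        recoded = dict(row)  # copy, so the original row is left untouched
        recoded[choice_column] = mapping[text]
        return recoded
    return row


recoded_rows = [
    back_code(row, "heard_via", "heard_via_other", BACK_CODES) for row in rows
]
print([row["heard_via"] for row in recoded_rows])
```

The free text itself is never changed, so you can always see what the respondent actually wrote. Anything that does not match the lookup stays as ‘Other’.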

TIP 4: Spot and evaluate rogue responders

Example: Say someone fills in your survey with lies or nonsense for their own enjoyment.

Some people like to lie on surveys for whatever reason, or enter a load of responses at random.  These are hard to spot but there are some things that you might notice while you are doing the other checks on the data.  Did the respondent complete the survey unusually quickly?  Are their free text responses inappropriate in some way (e.g. offensive, nonsense, or indicative that they are lying)?  Does logic tell you they might be lying (e.g. they gave two answers that contradict each other, or said they did something that you know they could not have done)?  If you spot one, delete their whole response!

HOW TO: Go to the ‘Analyse results’ section, and click on ‘view respondent’s answers’.  Find the questionnaire for the answer you are concerned about.  Read through the whole questionnaire carefully and evaluate it for sense, logic, and consistency.  Look at ‘Time spent’ in the info box at the top to see if they might have responded too quickly and without due thought.  Decide whether you want to include the response, and if not you can delete it.
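
The ‘unusually quickly’ check is the one part of this that is easy to sketch on an exported file. The example below flags anyone who took less than a third of the median time, so that you know which questionnaires to read first. The `seconds_spent` column and the one-third rule are my own assumptions for illustration. Pick a cut-off that makes sense for the length of your survey.

```python
from statistics import median

# Hypothetical export rows with the time each respondent spent, in seconds.
rows = [
    {"respondent_id": "1", "seconds_spent": "300"},
    {"respondent_id": "2", "seconds_spent": "280"},
    {"respondent_id": "3", "seconds_spent": "320"},
    {"respondent_id": "4", "seconds_spent": "45"},
    {"respondent_id": "5", "seconds_spent": "310"},
]


def flag_speeders(rows, seconds_column, fraction_of_median=1 / 3):
    """Return rows completed in less than a given fraction of the median time."""
    durations = [float(row[seconds_column]) for row in rows]
    cutoff = median(durations) * fraction_of_median
    return [row for row in rows if float(row[seconds_column]) < cutoff]


to_read = flag_speeders(rows, "seconds_spent")
print([row["respondent_id"] for row in to_read])
```

This only produces a reading list. A fast response is not necessarily a dishonest one, so you still need to read the whole questionnaire before deciding to delete it.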

TIP 5: Change into your data processor hat

You’ll note from my tips here that a lot of these methods require you to look back at individual survey responses.  When I’m reporting on quantitative research I have no interest in individual survey responses, and no need to look at them one by one.  The value of quantitative research is entirely in the aggregate, and with my researcher hat on I rarely look at a single response.  BUT!  As a sole trader I’m not just the researcher, I’m also the data processor.  And as the data processor sometimes I need to go back to the original source to ensure that my data is fit for purpose.  So with your data processor hat on, a few last tips:

  • Be absolutely sure before you change or delete something, as you don’t want to have wasted a genuine respondent’s time and input.
  • Never change the spirit or tone of what the respondent said, or you will skew the data.
  • If you’ve done data entry from hard copies, you can always go back to the originals to double check things. Make sure you have numbered them so you can do this!
  • If you’re looking at individual questionnaires, treat the data and identifying information with all of the care and respect that it morally and legally deserves.
  • If in any doubt, you might like to save a complete data set before you start this process.
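
On that last tip, if you are cleaning an exported file, saving a complete data set can be as simple as taking a timestamped copy before you touch anything. Here is a minimal sketch. The file name `survey_export.csv` and the `backups` folder are made up, and the demo runs in a temporary folder so it does not leave anything behind.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def snapshot(export_path, backup_dir):
    """Copy the untouched export into backup_dir under a timestamped name."""
    source = Path(export_path)
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = target_dir / f"{source.stem}-original-{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also keeps the file's timestamps
    return target


# Demo in a throwaway folder with a tiny made-up export file.
with tempfile.TemporaryDirectory() as tmp:
    export = Path(tmp) / "survey_export.csv"
    export.write_text("respondent_id,q1\n1,Yes\n")
    copy = snapshot(export, Path(tmp) / "backups")
    copy_matches = copy.read_text() == export.read_text()

print(copy_matches)
```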

Good luck!
