
It’s all about the base (aka don’t use Surveymonkey graphs)

October 9, 2017

In quantitative research we ask lots of people lots of questions and we look for patterns within the data.  To do this in a robust way, ideally we do all of our analysis and manipulation within a fixed framework of data so that when we are presenting findings, we are comparing like with like.

Let’s pretend I’ve done a survey about animals and 427 people filled it in.  When we compare like with like we can present things such as:

We asked 427 people whether they like cats, and 62% said they did.

We asked THAT SAME 427 whether they like dogs, and 67% said they did.

However, only 32% of the 427 said that they liked both cats and dogs.

Thing is, with a self-administered web survey it is absolutely typical that some respondents skip a question or drop out and you end up with say 427 people answering Q1, and 421 answering Q6, and 412 answering Q12.  When this happens, it would be misleading to directly compare the percentages of people responding to Q1 and Q6 and Q12 because we would not be comparing the same respondents with the same respondents.  We would not be comparing like with like.

A little niggle that I have with Surveymonkey (which is what prompted me to write this post) is that it automatically analyses your data for you and presents it as attractive graphs – and I imagine it can be super tempting to use these as the output for your research.  But unless you look very closely, you might not notice that these graphs may NOT be comparing like with like because Surveymonkey bases each question on who answered it rather than who was supposed to answer it.

Here’s Q6 from my imaginary animal survey above, and you’ll see that it tells you that 421 people answered it and 6 skipped it.  Fine.  But it sneakily doesn’t tell you that it has done all of the calculations based on 421 respondents when you may well prefer it to have done them based on the full 427.


One way around this is to avoid it at the point of response by setting up your survey to ‘require an answer’ to every question.  This can put your respondents off and increase drop-outs, so you’d need to do the following anyway…

Another way around this is to filter on ‘completeness’ and discard all ‘partial’ responses.  With the survey above, you’d look for the question with the fewest responses (in this case Q12, with 412) and you’d keep the responses from those specific 412 people and you’d bin the rest.  However, if you’ve not got many responses to begin with, throwing some of them away isn’t always ideal.
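If you have exported the raw responses, the completeness filter might look something like the sketch below. The data is made up, as are the field names and the `complete_only` helper, and it assumes a skipped question comes through as `None`. It also takes ‘complete’ to mean ‘answered every question’, which is a little stricter than ‘answered the least-answered question’.

```python
# Hypothetical export: one dict per respondent, None marks a skipped question.
responses = [
    {"id": 1, "Q1": "Yes", "Q6": "No", "Q12": "Yes"},
    {"id": 2, "Q1": "Yes", "Q6": None, "Q12": "No"},
    {"id": 3, "Q1": "No", "Q6": "Yes", "Q12": None},
    {"id": 4, "Q1": "No", "Q6": "Yes", "Q12": "Yes"},
]
QUESTIONS = ["Q1", "Q6", "Q12"]


def complete_only(rows, questions):
    """Keep only the respondents who answered every listed question."""
    return [r for r in rows if all(r.get(q) is not None for q in questions)]


complete = complete_only(responses, QUESTIONS)
base = len(complete)  # every percentage is now calculated on this one base
```

Everything you report afterwards uses `base` as the denominator, so every question is compared on the same set of people.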

So instead, every time I analyse my data I check and possibly re-base every question on every filter.  What I mean is, I decide what my ‘base’ size is (i.e. how many people answered the survey) and I personally calculate all of my percentages based on this.  So, if 427 people answered Q1 it is likely that I will consider 427 my base.  Where question response differs from this base I add in a new category (‘no response’) and re-do all of the percentages.

So for Q6, I’d add a ‘no response’ category of 6 (427 minus 421) and then re-calculate everything.  You’ll note, you get very slightly different results.
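Here is a rough sketch of that re-basing arithmetic. The 427 and 421 come from the example above, but the split of Q6 answers into 265 ‘Yes’ and 156 ‘No’ is invented purely for illustration, and `rebase` is a hypothetical helper rather than anything Surveymonkey provides. It assumes a single-answer question, so the counts add up to the number of people who answered.

```python
def rebase(counts, base, label="No response"):
    """Return whole-number percentages of `base`, adding a no-response row."""
    answered = sum(counts.values())
    if answered > base:
        raise ValueError("more answers than respondents in the base")
    full = dict(counts)
    if base > answered:
        # Everyone in the base who did not answer goes into the new category.
        full[label] = base - answered
    return {option: round(100 * n / base) for option, n in full.items()}


# Hypothetical Q6 counts: 421 people answered, 6 skipped.
q6_counts = {"Yes": 265, "No": 156}

on_answered = rebase(q6_counts, 421)  # {'Yes': 63, 'No': 37}
on_full_base = rebase(q6_counts, 427)  # {'Yes': 62, 'No': 37, 'No response': 1}
```

With these made-up counts, ‘Yes’ drops from 63% to 62% once the six non-responders are counted in the base.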


I do this every time for every question on every filter, even if the numbers are just one or two out.  It is a fiddly pain in the ass, but it is the right thing to do in order to present robust data that is not misleading.

Personally, I like to further mess with my Surveymonkey graphs to make them look visually better and to ensure they are presenting high quality information in other ways.  Here, for example, I have changed the vertical axis to whole numbers (as I think non-integer percentage points are almost always unnecessary).  I have also extended the axis to reach 100% and changed the colour, both of which are, in their own ways, methods of presenting questions like with like.  Whether you follow my lead and re-base everything or not, it is always always ALWAYS good practice to correctly label your graphs (and tables), including noting the base size, so that the reader can properly understand them.
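To make the labelling point concrete, here is a small plain-text sketch. `text_chart` is a hypothetical helper and the percentages are invented. The idea is simply that the title always carries the base and every bar is drawn against the same 0 to 100% scale; the same two habits carry over to whatever charting tool you actually use.

```python
def text_chart(title, percentages, base, width=50):
    """Draw bars on a fixed 0-100% scale and note the base in the title."""
    lines = [f"{title} (base: all respondents, n={base})"]
    for label, pct in percentages.items():
        # `width` characters always represent 100%, whatever the data says.
        bar = "#" * round(pct * width / 100)
        lines.append(f"{label:<12} {bar:<{width}} {pct}%")
    return "\n".join(lines)


# Hypothetical whole-number percentages for Q6, re-based on 427.
chart = text_chart("Q6", {"Yes": 62, "No": 37, "No response": 1}, 427)
print(chart)
```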


Basically, I’m advising you to do your own graphs rather than nicking them straight out of Surveymonkey.  You’re better than that.





