Data Visualisation by Dummies

While I’ve spent most of the last six years selling the dream of self-service data visualisation tools, there are some days I wish I hadn’t.

It’s not because there is anything wrong with the tools (I’m not going to mention any by name, for fear of offending somebody by leaving them out) – it’s just a problem of the users.

You see, most of us don’t know anything about how to visualise data. But give us a tool that allows us to produce something that looks pretty or looks professional, and off we go – pretending we’re the next Stephen Few.

You would be appalled to go to the Met or the Tate and see artwork from amateurs with no skill or creativity. You would be disgusted to pay at a restaurant for a meal made by an incompetent chef. You wouldn’t accept a novel written by an illiterate author. Why, then, do we accept data presented visually in such appalling ways, which in the worst cases aren’t just plain wrong but misleading?

Those of you who used to work with me at BIPB will remember my ‘WTF Visualisation’ class, which I ran as part of our graduate training programme. Aside from my attempt to provide some light relief from a heavy week of data modelling for our newest team members, the point of these classes was to encourage the analysis of data visualisation and to spot glaring errors before they happened in the field.

Here are a couple of recent examples (both taken from ):

Flagrant abuse of the Y-axis
Image source

The point of a data visualisation is to help the viewer reach a conclusion immediately. Here, you would be forgiven for thinking that public sector jobs have overtaken private sector jobs over the last five years. Statistics Canada, however, cannot be forgiven for their flagrant abuse of the Y-axis in distorting the information presented. As any of my data visualisation mentees will know, like-for-like comparisons on split axes are never a good idea, particularly when the two axes use different scales, as in this example. A truncated axis (one that doesn't start from zero) needs to be marked with a Z symbol to show the break. And never, ever mix colour and line style in your plots (solid orange line vs dotted green line) unless you want to run the risk of being accused of bias.
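To put a rough number on how much a truncated axis distorts things, here is a minimal Python sketch. It assumes the viewer judges a value by its height above the bottom of the axis, and it compares the change the eye sees with the change in the data. The function name `visual_exaggeration` and the figures are my own inventions for illustration; they are not taken from the chart above.

```python
def visual_exaggeration(start, end, axis_min=0.0):
    """Return how many times larger a change looks than it really is.

    Assumes positive values and a viewer who reads each value as its
    height above the axis baseline. A result of 1.0 means the chart
    shows the change at its true proportion.
    """
    if start <= max(axis_min, 0):
        raise ValueError("start must be positive and above the axis baseline")
    if end == start:
        raise ValueError("there is no change to exaggerate")
    apparent_change = (end - start) / (start - axis_min)  # what the eye sees
    actual_change = (end - start) / start                 # what the data says
    return apparent_change / actual_change


# Made-up figures for illustration only.
honest = visual_exaggeration(100, 110)                   # axis starts at zero
truncated = visual_exaggeration(100, 110, axis_min=90)   # axis starts at 90
print(f"zero-based axis: x{honest:.1f}, truncated axis: x{truncated:.1f}")
```

With these invented numbers, a 10% rise plotted on an axis that starts at 90 looks like a doubling, which is roughly ten times its true size. Start the axis at zero and the same rise is drawn at its true proportion.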

Accidents waiting to happen... no, wait, they already did!
Image source

Anyone who knows me will also know that there are few things I hate more than peanut butter and pie charts, but one thing that almost always gets my goat is infographics.

That's not because I don't have respect for the designers who slave away for hours producing them; it's because they seem to forget that there is a difference between something that looks good and something that's designed to communicate information. I'll forgive an ugly infographic that's communicative, but I'll never forgive one that's both ugly and useless. The example above, from the Guardian, is a case in point.

So, what can you do to avoid these mistakes?

Firstly, before you begin a data visualisation, always ask yourself what it is you are trying to achieve. Even if you don't know what you might find in exploring the data, at least be clear about any biases or preferences you have before you start.

Next, when changing the style of a visualisation, ask yourself why you are doing it. Is it to help make the point of the visualisation a little clearer, and is there any risk of your change distorting its impact?

Finally, ask someone else the question: "What does this tell you?" If they don't give you the answer you were hoping for, then go back to the drawing board.

Do add to your browser favourites also. You never know when you might spot a clanger from someone you know!

And what is the worst data visualisation you've seen? Please post below: