When Should We Use A Slope Graph?
Slope graphs are becoming more commonplace. Not only are they growing in popularity in the data viz community, they are also appearing more frequently in the media. I had previously only ever considered them as simplistic line charts, but for a while they were a fashionable chart choice for Makeover Monday, and I happened upon Andy Cotgreave’s reference to them in his talk at the 2016 Conference, New Ways To Visualise Time.
In his talk, Andy takes a number of different approaches to depicting activity over time, and one of these is a slope graph. In Andy’s own words, he could “grab all the data marks that are not at the start and the end, and exclude them”, and he has written a blog post which goes into further detail, with example calculated fields to help make the approach more interactive along the way. Until recently, this was my understanding of the purpose of a slope graph – to show change over time between just two fixed points (the start and the end). And then I saw this, on the FT’s Data Twitter feed:
This was a new take on the slope graph, and one I wasn’t particularly comfortable with. It wasn’t depicting a change over time, but a difference between population sets. Its axes also didn’t begin at zero, which appeared to break what I understood to be a golden rule: you should only truncate the y-axis if zero is a practical impossibility or if you are trying to show very nuanced degrees of change, such as in stock prices.
Either the chart was not following good practice, or I wasn’t up on slope graphs. So, as one does, I took to Twitter to consult those who would know better. Andy Kirk was both quick and kind enough to respond:
I’m ok with them Mark. Don’t need zero origin for slope graphs and they can be simply to connect comparable series rather than 2-pts in time
— Andy Kirk (@visualisingdata) September 23, 2017
Chris Love shared Andy’s view, and Cole Knaflic suggested that familiarity had a role to play:
Familiarity & accessibility are additional considerations—many aren't as comfortable with dot/box/whisker vs. lines (not intimidating!).
— Cole Knaflic (@storywithdata) September 23, 2017
But I still wasn’t convinced – I wanted some clearer justification, and at this point I was going to get more out of researching the topic than pestering experts on the subject. Mostly I was struggling with the notion of diagonal connections between the marks – one population being a subset of the other meant that they were from the same original data pool, but the areas with most refugees were effectively being singled out. Additionally, the difference between the two seems to be of key importance to the message being conveyed, but by expressing it as a diagonal line in two dimensions, the magnitude is lost. In the unemployment chart on the left, the percentage point difference in the East is 0.4 whilst in the West it is 1.6 – four times greater. On the chart, however, the pink line is only around 30% longer than the blue line. This, along with the missing 0% reference point, can’t be right. Cole also pointed out that the two graphs use different scales, which may not be immediately obvious.
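To make the line-length point concrete, here is a rough sketch in Python. It treats each slope line as the hypotenuse of a right-angled triangle whose vertical side is the percentage point difference and whose horizontal side is the gap between the two columns of the chart. The function names and the horizontal gap values are my own assumptions for illustration; in particular, the gap of 1.82 (measured in the same on-screen units as one percentage point) was not taken from the FT chart, but simply chosen because it reproduces the roughly 30% figure above.

```python
import math


def slope_line_length(vertical_diff, horizontal_gap):
    """Length of a slope line, with both sides in the same on-screen units."""
    return math.hypot(horizontal_gap, vertical_diff)


def length_ratio(small_diff, large_diff, horizontal_gap):
    """How many times longer the steeper line is than the shallower one."""
    return (slope_line_length(large_diff, horizontal_gap)
            / slope_line_length(small_diff, horizontal_gap))


east_diff = 0.4  # percentage point difference in the East (from the chart)
west_diff = 1.6  # percentage point difference in the West (from the chart)

# The horizontal gaps below are hypothetical, not measured from the original.
for gap in (0.0, 1.0, 1.82, 4.0):
    ratio = length_ratio(east_diff, west_diff, gap)
    print(f"horizontal gap {gap:>4}: West line is {ratio:.2f}x the East line")
```

Only when the horizontal gap is zero (in other words, when the comparison is made in one dimension) does the ratio of the lengths match the true fourfold difference. The wider the gap between the two columns, the closer the two lines come to looking the same length.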
I started by trying to look at what I felt would be better ways to show that data, my instinct being that showing differences in one dimension (vertical or horizontal) was more appropriate than two. Additionally, the source material tells us that the areas with most refugees represent 10% of the regional comparator, which we can use as a sizing tool if appropriate, and the nature of having East and West in our data gives the opportunity to present the regions in an appropriate order.
I’m relatively comfortable that this is clearer overall than the slope graph, if perhaps not nearly as stimulating. But I still hadn’t uncovered more complete advice on the whens and the whys of slope graph usage. I found some good blogs on the topic, and ploughed through them. In the following paragraphs, I run through them chronologically, and I would urge you to click through to each of the links and read any articles you haven’t read previously. If you only have time to read one, make it the first.
A Potted History Of The Slope Graph
Some six years ago, Charlie Park wrote this blog post, covering a potted history of the slope graph to that point. He credits Edward Tufte with the origins but also cites some non-Tufte examples. Critically, those examples aren’t analyses of change over time – they cover a range of use cases. He followed that post up with another five months later, adding further use cases he’d come across, as well as comparable chart types. In both of Charlie’s articles he provides guidance on best practice.
A couple of years on, Andy Kirk wrote this blog post declaring his passion for them and Ben Jones followed up by taking Andy’s Excel creation and providing a step-by-step as to how to reproduce it in Tableau.
In 2014, Cole Knaflic wrote her own article, and the first example she uses mirrors the FT’s usage in its German immigrants viz, comparing a subset of the population with its whole. Articulate as ever, she emphasises that connecting the points makes sense (since they have a relationship) and visually displays the relative difference between them. Her examples use many more data points than the FT’s two, and use colour to good effect to call out those lines which buck trends, for example.
After immersing myself in slope graphs, I almost couldn’t stop seeing examples where one might be a better chart choice. Here’s an example from a recent Guardian article about the rise of working mothers in the UK.
Here we are being encouraged to compare the proportions of working women in nine occupation groups, between those with and without children. The key message is that “a quarter of working women with children are in professional occupations”, compared with just under 20% of those without children. The intention is to highlight the disparity within professional occupations, which a slope chart would do much more effectively, something like this:
Once you get over the fact that my own quick-and-dirty design skills are not up to scratch, it’s easy to identify some key advantages to this approach:
– one line on the slope graph accounts for two bars on the bar chart
– the reader is very quickly drawn to those with lower values (to do so with bars would require a third colour or a use of a bullet chart)
I’m now converted. I would still contest the use of slope graphs depicting subsets versus the whole, but certainly accept that they’re visually clean and simple ways to show differences between two elements which hold a relationship. Preferably the two elements should be on the same scale, but even where they are not, there are cases in which a slope graph remains a viable chart choice. However, as Charlie Park usefully reminds us, we should always be careful that we are not trying to introduce meaning where it doesn’t exist.
In summary, and to answer the question this blog post asks, we can opt to use slope graphs to show the differences between two related points, be they changes over time or changes in the formation of a group (e.g. part vs whole). My own bias against those that didn’t strictly compare changes over time was simply rooted in the fact that this was my introduction to slope graphs, and I’d anchored them to this use case. No longer!
All hail the slope graph!
Featured image taken from pxhere.com under the Creative Commons License.