We Don’t Need Another Zero
This is a drawn-out post about formatting, and how to make simple adjustments to improve formatting in axes and tooltips. But I promise it’s worth seeing through to the end, if only for the amazing music video at the footer and the inevitable earworm that accompanies it.
I have a real bugbear about a single aspect of formatting in charts. This isn’t a recent gripe – it’s been going on for years, but now that I’m creating and reviewing more charts, I’m increasingly aware of it. My commitment to write this post was propelled by Mike Cisneros, whose excellent posts of late each use wordplay much more effectively than I ever could in a lifetime. His recent blog “Say Something” is a provocative take on recent online discussions on the virtues of blogging for the sake of it versus having something meaningful to convey in your writing. And it’s this that has finally teased out of me what I’m writing about now – I need to get it out of my system, but with the intention that it drives change, that people notice it more and address it in their own work.
I’ve struggled to articulate it clearly before, however. I started and stopped writing a blog on this topic a few times – and the date actually logged for the first attempt was back in April of last year. This unwritten blog post has been eating away at me every bit as much as the subject matter, onto which I shall clumsily meander now.
It’s about zero.
Not just any zero, though. Zero gets used in many guises, so let me be specific. It’s not the number zero that bothers me, but the wholly unnecessary and wasteful use of it in decimal places. This may seem frivolous, but I would like to use the rest of this post to outline the specific usage I’m concerned with, demonstrate some examples, attempt to provide justification as to why it should be cited and culled, and to give some practical guidance as to how to rid oneself of the scourge.
In reaching for examples where I’ve seen this I didn’t have to look far. Here are some taken from a range of recent vizzes (identities deliberately not shown in order to protect the authors – this isn’t intended to single anyone out):
I promise I’m not trying to pick on any individuals here. Many of these examples came from Zen Masters and from the Viz of The Day gallery. I saw one at the IronViz final at #Data17 and haven’t been able to un-see it since. These authors are among those I have the greatest admiration for. But that serves to underline the fact that nobody is immune to this, and that it’s something that affects even the very best.
Many of these examples show percentages, but I’ve seen it in currencies, temperatures and population figures. These trailing zeros are most prevalent in axis labels and in legends.
A Fine Example
So, let’s look at an example in more detail. I use this both with permission and also because it’s taken from one of my favourite recent vizzes, so I get a chance to promote it a little bit too. Emily Chen’s #makeovermonday on life expectancy was one of those that took an already impressive week’s set of community contributions and simply blew them away. It’s here (and the interactive version is on Tableau Public):
Now, there are some subtle differences between this and the original submission, in the annotation layer and the labelling of the axes. To her credit Emily addressed this herself within hours anyway, but take a look at the x-axis on the version she’d initially tweeted:
The labels range from -50.0 to 40.0 at intervals of 10.0. The decimal part here is entirely superfluous. Not only that, but it slows down comprehension. In order to make sense of the scale we need to process that this is displayed in a decimal form yet doesn’t materially need one. To interpret the chart we aren’t doing anything in tenths of a year. So in our heads we realise that the axis is counting up in tens, but then any time we want to cross-refer between the pane and the axis there’s that little shift in comprehension required to adjust for it.
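If you build your axis labels in code rather than in a tool, the same tidy-up can be automated. What follows is a minimal Python sketch, not anything Excel or Tableau does internally: the function names `min_decimals` and `tick_labels` and the `max_dp` cap are my own inventions for illustration. The idea is to find the fewest decimal places that still show every tick exactly, and use that for the whole axis.

```python
def min_decimals(ticks, max_dp=6):
    """Fewest decimal places that show every tick without losing information.

    max_dp is an assumed cap on the precision we ever want on an axis.
    """
    for dp in range(max_dp + 1):
        # A tick is safe at dp places if rounding to dp changes nothing.
        if all(round(t, dp) == round(t, max_dp) for t in ticks):
            return dp
    return max_dp


def tick_labels(ticks, max_dp=6):
    """Format all ticks with one shared, minimal number of decimal places."""
    dp = min_decimals(ticks, max_dp)
    return [f"{t:.{dp}f}" for t in ticks]


# The axis from the example: -50.0 to 40.0 in steps of 10.0
ticks = [float(v) for v in range(-50, 50, 10)]
labels = tick_labels(ticks)
print(labels)
```

An axis that genuinely steps in halves (0.0, 0.5, 1.0) keeps its single decimal place, so the rule only strips zeros that carry no information.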
Let’s be clear, this is not the greatest crime in dataviz history, and nor does it result in the reader being misled. But it does hinder effective interpretation, which risks the usefulness being chipped away at, even if just a little bit. And it happens with alarming regularity, and the best in the game are doing it. There’s also the opportunity to claw back some vital chart real estate by borrowing space that was otherwise taken by a steady line of zeros along the axis.
I accept that often imperfections are left behind when we don’t have much time to complete our work, but often when we spend a lot of time tweaking the most minor formatting detail, this can get overlooked. It really doesn’t help that the tools we use to visualise data will often allow this to happen as a default, and this problem really only comes up when the default isn’t edited. Excel does it. Tableau does it. Others will too, I’m sure. Let’s walk through a few more examples of how incredibly easily it occurs.
Back To Basics
Let’s begin with our data in Excel, and whilst we are there examine what treatment Excel gives it. I’ve fabricated a simple dataset with the number of people stating their preference for each of four fruits. In the third column I’ve calculated the proportion, beginning with a decimal and then placing formatting on the cells to represent each figure as a percentage, without decimal places. In the next column is the same data, but now shown to one decimal place.
By default, Excel takes these formats and applies them directly to the axes and to the tooltips. It assumes that the way you have formatted your data is the way that you want it to be displayed in your chart. You can change this of course, but that requires additional formatting steps, which will often be a barrier to the user rattling off their analysis in double-quick time.
In fact, changing this in Excel is simple enough. The axis has number format options, and there’s a text box specifically to control the number of decimal places shown. So, adjusting the axes in the examples above without affecting what’s shown in the tooltip could look like this:
The benefit of this relatively trivial exercise should be apparent enough, but the argument is made easier by asking what benefit is gained by retaining these decimal places on the axis? If there is a justification to be made then that’s great. If there isn’t then the decimal places should be removed.
And if I’m not persuading you enough yet, no less an authority than Cole Knaflic has this to say on the topic:
“One of my biggest pet peeves is trailing zeros on y-axis labels: they carry no informative value, and yet make the numbers look more complicated than they are! Get rid of them, reducing their unnecessary burden on the audience’s cognitive load.”
Okay, so given that this is a Tableau-centric blog, written by Tableauphiles, let’s take the same approach and check out how difficult it is to tidy the formatting in Tableau Desktop. Firstly, here’s the data source view. It’s ignored any formatting I’d placed on the cells in the Excel table, and simply read the data out of the calculation, albeit limited to six decimal places.
Minor Diversion – Formatting Decimals In Tableau
The actual decimal expression for the proportion of Apples (534/1138) comes in at way more than six decimal places, but that’s as much as Tableau reveals. Yet, curiously, when I pull the decimal expression in to the view, it then shows it at four decimal places:
What’s fascinating, however, is that if you force the format to show apparently unlimited decimal places, it will cap the expression. At first I thought this was at 15dp, but on further inspection it’s at 15 digits beginning at the first non-zero digit. Here, Pears are shown to 16dp while the other fruits are to 15dp, and if I create a measure which divides by a further 1,000, these shift to 19dp and 20dp respectively. Huh.
Anyway, we’re getting away from the point.
Whatever, just show me how to fix this in Tableau
So, I have my decimal in Tableau, which I now want to express as a percentage. I might wish to refer to it with some precision in a label or tooltip, but the axis needs to show 0dp. I’ll create a duplicate of the decimal measure, and set the default format to a percentage with 2dp.
Now if I add that measure to the view, I get this:
Moving the measure onto Rows, I can create this simple bar chart. I’ll also add the same measure (unformatted) to the text label and to the tooltip:
So now let’s suppose that I want to show these independently. The axis at 0dp, the label at 1dp and the tooltip at 2dp. There’s no obvious reason why, but it helps to distinguish the three values in this example. Let’s right-click on the axis and select ‘Format’. Fixing the axis is straightforward, and likewise the label, so long as you’re careful to switch between the ‘Axis’ and ‘Pane’ options:
And when I do that, my axis shows to 0dp and the label to 1dp. BUT WAIT! My tooltip also changed!
Unfortunately, when I changed the axis format, the tooltip format changed along with it. These are linked together, and for as long as I’m using the same measure in both my axis and my tooltip they will format identically. So, what’s the solution? Unfortunately the way round it is slightly more work, but absolutely worthwhile. Simply duplicate your measure, and bring that duplicate onto your tooltip. Format this in the same way (being sure to format the ‘pane’ version of it), and then we’re good to go. We have 0dp on the axis, 1dp on the label and 2dp in the tooltip. And it only took half a dozen extra clicks to get there.
Meanwhile, the labels in a legend are linked to those in the ‘pane’ view, so again if you want to show a different format in the text label to the legend, you’ll need a duplicate measure.
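The principle underneath all of this is that one stored value can be given a different format in each place it appears. Outside Tableau, that separation might be sketched in Python like this. It’s only an illustration of the idea: the `FORMATS` mapping and its role names are hypothetical, and the only figure taken from this post is the Apples proportion of 534/1138.

```python
# One underlying value: the Apples share from the example dataset.
proportion = 534 / 1138

# Hypothetical mapping of where the value appears to how it is formatted.
# The stored number never changes, only its presentation.
FORMATS = {
    "axis": "{:.0%}",     # 0dp: just enough to read the scale
    "label": "{:.1%}",    # 1dp: a little more detail on the mark
    "tooltip": "{:.2%}",  # 2dp: full precision on demand
}

rendered = {role: fmt.format(proportion) for role, fmt in FORMATS.items()}
print(rendered)
```

Keeping the formats apart from the value is what the duplicate measure achieves in Tableau: each copy carries its own format, so changing one no longer drags the others along with it.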
Should I Always Remove The Trailing Zeros, Then?
No. Not always. It depends upon whether it aids comprehension to leave them in. For example, if we’re showing a stock price ranged between $9 and $10 with 10 cent intervals, it would be most unusual to display $9.3 on the axis, since we’re not familiar with seeing monetary values in tenths of a dollar, so we show $9.30.
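As a rough Python sketch of that currency convention (the function name `currency_labels` is made up for this example, and the whole-dollar rule is my own assumption about what reads best): show no decimals when every tick is a whole dollar amount, and otherwise show exactly two, never one.

```python
def currency_labels(ticks):
    """Label dollar ticks with 0dp if all are whole dollars, else 2dp.

    Assumption: one decimal place ($9.3) is never wanted for money.
    """
    whole = all(float(t).is_integer() for t in ticks)
    dp = 0 if whole else 2
    return [f"${t:,.{dp}f}" for t in ticks]


# A stock price axis from $9 to $10 in 10 cent steps
price_ticks = [i / 10 for i in range(90, 101)]
price_labels = currency_labels(price_ticks)
print(price_labels)

# A revenue axis in whole dollars needs no decimals at all
print(currency_labels([0, 10, 20]))
```

Here the trailing zero in $9.30 earns its place, because it matches how we’re used to reading money.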
Here’s a good example where it makes sense in the context of the data being used (full Tableau Public link here). Some baseball statistics, for example, are almost always referred to at three decimal places. Showing this at intervals of 0.30, 0.35, 0.40, 0.45 etc. would make little sense.
Ultimately this is a small change for a small but significant gain. But when it comes to bearing the end user in mind, we can never have too many good habits, right?
Right then, you’ve made it to the end. Thank you for persevering! As advertised, here’s Tina: