Line Charts: Where to Start?

Tim Brock / Tuesday, May 5, 2015

I've previously explained that it is essential that the bars of bar charts start at 0. The reasoning is simple: we use relative lengths of bars to compare values, so starting a bar somewhere else leads to false judgements. But what about line charts?

Below is a line chart with three datasets: A, B and C. We can see that:

  1. all lines are well above zero across all the years;
  2. A is roughly flat;
  3. B trends downward with a jump in the mid-1980s;
  4. C trends upwards.

Only point 1 above is made clearer by starting the y axis at 0. If we care more about trends, gradients and the size of the noise, then focusing the chart on the area that actually contains data (as below) lets us see these aspects at a higher resolution. That's true whether we're looking at different sections of one line or comparing across multiple lines.
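As a rough sketch of what focusing on the area that contains data might look like in code, the Python function below derives y-axis limits from the data range plus a small margin. The name `data_focused_limits`, the 5% padding default and the sample values are all assumptions made for illustration; they are not the data behind the charts in this post.

```python
def data_focused_limits(series, pad_fraction=0.05):
    """Return (low, high) y-axis limits that hug the data.

    series: a list of lines, each a list of numbers.
    pad_fraction: hypothetical margin, as a fraction of the data range.
    """
    values = [v for line in series for v in line]
    if not values:
        raise ValueError("need at least one data point")
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Flat data: treat the range as 1 unit so the axis still has height.
        span = 1.0
    pad = span * pad_fraction
    return lo - pad, hi + pad


# Made-up values for three lines that all sit well above zero.
a = [50.0, 50.5, 50.2, 51.0]
b = [70.0, 69.0, 65.5, 65.0]
c = [30.0, 30.2, 32.0, 34.0]
low, high = data_focused_limits([a, b, c])
```

The resulting pair could then be handed to whatever charting tool you use, for example matplotlib's `Axes.set_ylim(low, high)`.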

With this improved resolution we can now see just how big the jump in the mid-1980s is for B: a change of 3 or 4 in Value in a single year. We can see that the upward trend in C isn't present in the early years. There might even be a hint that A trends ever so slightly upwards too. Further, while a table is the best option for displaying very precise information, this second chart is still an improvement on the first when it comes to accurately estimating values for a given year.

I've tried to make the case that it isn't generally necessary to include 0 on the vertical axis of a line chart and that there are frequently advantages to not doing so. Nevertheless, it can be useful to guide your audience away from making the assumption that the y axis does start at 0. The chart below illustrates a potential issue.

The problem with this chart is the visual metaphor of line D crashing to the bottom. Of course if the y axis started at 0 this wouldn't be a problem. But we don't need to extend our axis that far to reduce the salience of the misleading metaphor; even a little extension helps.
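A small extension of this kind can be expressed as lowering only the bottom limit by some fraction of the current axis height. The sketch below is one hedged way to do that; the function name `extend_floor`, the 25% default and the example limits are hypothetical and chosen purely for illustration.

```python
def extend_floor(low, high, extra_fraction=0.25):
    """Lower the bottom axis limit by a fraction of the current axis height.

    extra_fraction is a hypothetical knob: 0.25 adds a quarter of the
    current height as empty space beneath the lowest point.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    if extra_fraction < 0:
        raise ValueError("extra_fraction must be non-negative")
    return low - (high - low) * extra_fraction, high


# Made-up limits for a chart whose line ends right at the bottom edge.
new_low, new_high = extend_floor(40.0, 60.0)
```

With these made-up numbers the floor moves from 40 to 35 while the top stays at 60, so the line no longer ends on the chart's bottom edge.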

However, D is still fast approaching the dark(er) horizontal axis at the bottom. While the axis lines provide convenient separators between chart area and labels, they're not strictly necessary. So we can remove the x-axis line and tick marks without any loss in meaning.

Still, the labels themselves could be seen as an indicator of line D's fast approach to the bottom. Why not move them to the top?

We could probably stop there. But I like experimenting. The final change I'm going to make to this chart is a more novel, but subtle, experiment. Rather than simply suppress one visual metaphor (the line crashing into the axis at the bottom), we'll attempt to replace it with another. By fading away the bottom of the chart area we'll try to convey the idea that the vertical scale actually continues downwards into the distance.

Is this last change helpful, a hindrance or neither? I'm not sure. I don't think it's particularly straightforward to implement in most charting software. Hence, one of Colin Ware's guidelines for information visualization (Colin Ware, Information Visualization, Third Edition, page 24) seems relevant: "Consider adopting novel design solutions only when the estimated payoff is substantially greater than the cost of learning to use them."

So far the discussion has centered entirely on modifying the vertical scale. The horizontal extent of the datasets has been ignored, or it has been implicitly assumed that what's visible is all there is. Time series are frequently cropped in the horizontal direction. This may seem like a dubious activity, but it is often just a means of increasing resolution over a specific period of interest. In that case the benefit is exactly the one we saw above from reducing the vertical axis. There is, however, a notable difference. Reducing the vertical extent of a line chart will generally only reduce the whitespace. Cropping the horizontal axis reduces whitespace and removes data from view. For that reason, when you first see a line chart you have reason to distrust, perhaps the first question to ask is "Why does the x axis start there?" and not "Why doesn't the y axis start at (or include) 0?" Of course, when you're making your own charts you should ask yourself both of these questions.
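The difference between the two kinds of cropping can be made concrete by counting how many points each one hides. The Python sketch below does this for a made-up yearly series; the helper `crop_years`, the years and the values are all assumptions for illustration only.

```python
def crop_years(points, first_year, last_year):
    """Keep only (year, value) pairs within [first_year, last_year].

    Returns the kept points and the number of points hidden by the crop.
    """
    kept = [(year, value) for year, value in points
            if first_year <= year <= last_year]
    return kept, len(points) - len(kept)


# Made-up series: one value per year from 1975 to 1994.
series = [(year, 50.0 + 0.5 * (year - 1975)) for year in range(1975, 1995)]

# Shrinking the vertical range to the data's extent hides no points...
values = [value for _, value in series]
hidden_by_y = sum(1 for v in values if not min(values) <= v <= max(values))

# ...whereas cropping the horizontal axis to 1985-1994 hides ten of them.
kept, hidden_by_x = crop_years(series, 1985, 1994)
```

Reporting the hidden count alongside a cropped chart is one way to answer "Why does the x axis start there?" before a reader has to ask.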