How to Improve Your Data Visualizations with Annotations

Tim Brock / Tuesday, October 27, 2015

Merriam-Webster defines an annotation as "a note added to a text, book, drawing, etc., as a comment or explanation". Chart annotations can provide extra detail, highlight points of interest or simply disambiguate. However, filling a graphic with annotations can distract from the visual salience of the data itself, so it's important to find the right balance. If we say that a chart's title, axis labels and axis titles are structural components rather than annotations, then it's probably safe to conclude that many charts don't require annotation.

Straddling the line between structural component and annotation are line and marker labels. Where possible, I try to label lines directly. In the example below the labels are at the end; for a line illustrating a distribution I'd be more inclined to place the label centered above the maximum. Either way, this reduces the need for the viewer to switch gaze from line to legend and back, and it requires no color identification task, which can be awkward for those who have a color vision deficiency (color blindness). If the label names are short then direct labeling will use up much less additional space (there's no need to draw a line for each label in the key, for instance), and for a fixed image size we have more room to display our data.
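The two placements described above can be worked out before anything is drawn. Here is a minimal sketch in plain Python, not tied to any charting library. The function name `label_anchor` and the placement names "end" and "peak" are hypothetical, chosen only for illustration, and the data is made up.

```python
def label_anchor(xs, ys, placement="end"):
    """Return the (x, y) data point a direct label should be attached to.

    placement="end"  -> last point of the line (e.g. a time series)
    placement="peak" -> highest point of the line (e.g. a distribution)
    Both placement names are illustrative assumptions, not a library API.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    if placement == "end":
        return xs[-1], ys[-1]
    if placement == "peak":
        # Index of the maximum y value; ties resolve to the first occurrence.
        i = max(range(len(ys)), key=ys.__getitem__)
        return xs[i], ys[i]
    raise ValueError(f"unknown placement: {placement!r}")


# Made-up data for illustration only.
years = [2011, 2012, 2013, 2014]
sales = [3.0, 4.5, 4.0, 5.5]
print(label_anchor(years, sales, "end"))    # anchor at the final point

bins = [0, 1, 2, 3, 4]
density = [0.05, 0.25, 0.40, 0.25, 0.05]
print(label_anchor(bins, density, "peak"))  # anchor at the maximum
```

A plotting library would then draw the text just to the right of an "end" anchor, or horizontally centered just above a "peak" anchor.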

In multi-category scatter plots (like below) it's too much to directly label every point, and labeling only one point per category can be confusing. With lines that frequently cross, neat and unambiguous labeling of each line can be difficult (though matching label color to line color may help), especially if trying to automate the chart creation process. Consequently, direct labeling isn't always a practical solution and a discrete legend may be required.
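To give a flavor of why automation is awkward, consider just one small piece of the problem: lines that end at similar values produce overlapping end-of-line labels. The sketch below nudges such labels apart. It is an assumption-laden illustration rather than a recommended algorithm: `spread_label_positions` and `min_gap` are hypothetical names, the values are made up, and it does nothing for lines that cross mid-chart or for labels pushed beyond the top of the plot.

```python
def spread_label_positions(end_values, min_gap):
    """Nudge end-of-line labels upward so no two sit closer than min_gap.

    end_values: dict mapping a line's label to the y value where the line ends.
    min_gap:    smallest allowed vertical distance between labels (data units).
    Returns a dict mapping each label to its (possibly shifted) y position.
    """
    placed = {}
    previous = None
    # Work from the lowest line end to the highest.
    for name, y in sorted(end_values.items(), key=lambda item: item[1]):
        if previous is not None and y - previous < min_gap:
            y = previous + min_gap  # push this label up just clear of the last one
        placed[name] = y
        previous = y
    return placed


# Made-up end values: lines "A" and "B" finish almost on top of each other.
ends = {"A": 10.0, "B": 10.5, "C": 20.0}
print(spread_label_positions(ends, min_gap=2.0))
```

Even this simple case moves a label away from the line it describes, which is one reason a separate legend is sometimes the more honest choice.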

As hinted at in the first paragraph, annotations can also be used to provide specific details about individual points, clusters of points and line segments. They can even be used to explain the empty spaces — sometimes the biggest insights one can get from a chart come from understanding where the data points aren't! Knowing what to label, where to place the labels and how they should appear may not be entirely obvious and is frequently a matter of trial and error. As always, context is important.

One frequent use of annotation is to print the values of bars in a bar chart, as in the example below. This can be helpful but shouldn't be thought of as necessary: precise values are best displayed in an accompanying table.

As I (hopefully) demonstrated in my recent article, connected scatter plots benefit greatly from annotation in two ways. Firstly, we can label the points with the value of the third variable we're interested in, the one that isn't plotted along either axis (usually this means time). Secondly, specific anomalies or points of interest can be explained through more extensive text. Here's the final chart from that article again.

Note that the annotations use the same simple typeface (Helvetica) as the axis labels. There's no reason to use fancy fonts or vibrant colors, as in the remake below; they only act as a distraction and make reading difficult.

Connected scatter plots generally work well with large amounts of annotation that help to tell a specific, evolving story. However, sometimes the story in a chart really only concerns a single data point. The rest of the data is there to provide context and illustrate the anomalous nature of that one point. The annotation can give it focus as well as providing you with a platform to communicate further. Sometimes you'll be able to explain the cause of the outlier; on other occasions you might have to explain that you have no clue what is going on. Both cases are interesting, but it's important your audience knows what you're trying to say: "This anomaly can be explained by..." or "I cannot explain this anomaly (HELP!?!)". The chart below illustrates the former (you can read more about the data here).

If a chart forms some part of a larger article then you probably don't want to annotate your graphics with blocks of text that just repeat what has been said in the blocks of text that surround the chart. However, thanks to Twitter and other social media, charts are now frequently shared without the surrounding text of the article. This provides at least some motivation for moving text from the surrounding article into the chart. In this sense, perhaps the "rules" for choosing which annotations should be added to a chart are evolving, like the story in a connected scatter plot.
