How (Not) to Lie with Data Visualization
Visualization can be a shortcut to insight or a trapdoor into falsehood.
Big data analytics and data visualization have been flourishing hand in hand over the past decade. Data analytics technology allows us to derive insight from very large, dynamic and ambiguous data. Data visualization then allows us to communicate the story told by the data visually, which helps us make those insights actionable. But we can only reap the benefits of data visualization if we use it honestly.
Getting Big Value from Big Data
Data visualization is important for two reasons. Data is of little value if it does not lead to business-critical insights, and insight is of little value if data analysts cannot perceive it intuitively and communicate it to decision makers in a compelling manner. That’s why so much enterprise software presents data graphically via dashboards, icons, graphs, charts and other (often interactive) visualizations.
A well-defined visual language helps users to understand, analyze and discover multiple complex aspects of their data. However, defining this language is not a trivial task and poor communication can mislead customers, destroy trust and undersell the power of data analytics. This creates a tension between using understandable visual representations and communicating the complexity of the data behind them.
So, You Think Data Never Lies?
To make honest, accurate and actionable use of data visualization, it is crucial to accept that this is a tension that can never be fully resolved. Maybe raw data never lies but we can’t really use raw data. To make data actionable, we need to present it in a simplified (and therefore incomplete) form. Data visualization always lies a bit. The important thing is to make sure those little white lies are as little and as white as possible.
Unfortunately, data visualizations are often used specifically to distort the truth. It’s inevitable that we introduce some distortion by summarizing, classifying, sorting and filtering data in our visualizations. Even so, we must keep our eyes on the prize: honestly communicating the key truths hiding among all that data, in a way that allows us to make business decisions with a clear, measurable impact on the bottom line.
It is data’s hidden truths that create business value. If we use data visualization to lie, we effectively remove the business value from the raw data.
How Data Lies
So, how do you lie with data visualization?
Take this 3D pie chart, for example. To many data visualization experts, a pie chart is already something of a villain, but put it into three dimensions and things get worse. This pie chart is very difficult to read accurately because the perspective makes the grey slice look larger than it really is (it is actually half the size of the blue slice). Even if we added figures, many readers would still rely on their initial perception of the graph.
You don’t need fancy tricks with perspective to distort perceptions. Something as simple as a line chart can be misused in multiple ways. For example, the perception of a line chart changes dramatically depending on the range of values used on the vertical axis. With a wide range, the line looks almost flat and the chart may seem to show no significant changes in values. Narrow the range around the same data, though, and clear peaks may emerge.
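One way to make the effect of the vertical axis concrete is to measure how much of the plot's height the data actually occupies under different axis limits. The following is only a minimal sketch in Python: the function name `visual_fraction` and the `sales` figures are hypothetical, invented for illustration, and do not come from any particular charting tool.

```python
def visual_fraction(values, axis_min, axis_max):
    """Return the share of the vertical axis that the data's own range occupies."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    return (max(values) - min(values)) / (axis_max - axis_min)


# Hypothetical monthly figures, invented for illustration only.
sales = [98, 100, 103, 99, 101]

wide = visual_fraction(sales, 0, 120)     # axis starts at zero
narrow = visual_fraction(sales, 97, 104)  # axis hugs the data

print(f"wide axis:   {wide:.0%} of the plot height")
print(f"narrow axis: {narrow:.0%} of the plot height")
```

With these made-up numbers, the same five-point swing fills roughly 4% of the plot height on the wide axis and roughly 71% on the narrow one. Neither choice is automatically dishonest, but the sketch suggests why the axis range deserves a deliberate decision rather than a default.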
Distortions like these can be produced less through a desire to mislead than through a lack of visual literacy. Users of analytics software may have good intentions but without a thorough knowledge of how visualization works they may still create false impressions. Likewise, software vendors must create systems that make it simple for all users to produce visualizations which enable accurate insights.