Scaling to radius or area?

A collection of common dataviz caveats by Data-to-Viz.com




A common practice in data visualization is to scale a graphic component to a numeric value. For instance, bar lengths are scaled to values in a barplot. When the value is represented by a two-dimensional shape, however, scaling both dimensions at the same time produces a quadratic change in area for a linear change in value. This amplifies the differences.
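To make the quadratic effect concrete, here is a short derivation for the case of a circle. The symbols are assumptions introduced for illustration: v is the data value, r the radius, A the area, and k an arbitrary scaling constant.

```latex
% Radius proportional to the value:
r = k\,v
\quad\Longrightarrow\quad
A = \pi r^{2} = \pi k^{2} v^{2} \;\propto\; v^{2}

% Comparing two values v_1 and v_2:
\frac{A_1}{A_2} = \left(\frac{v_1}{v_2}\right)^{2}
```

So a value that is twice as large is drawn with an area four times as large.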

Example


Here is an example from Barack Obama’s 2011 State of the Union address. It shows the 2010 Gross Domestic Product of 5 countries, each value being represented by a circle. The radius of each circle has been scaled to the size of each nation’s economy.



[Image: circles representing the 2010 GDP of 5 countries, with the radius scaled to GDP]
Source: The 2011 State of the Union Address



Here, America’s economy looks way bigger than the others.

Correction


Scaling the radius distorts the perception of the relative sizes of the circles: the radius grows linearly with the value, but the area grows quadratically, so a value twice as large gets a circle with four times the area. As a result, the United States economy appears much bigger than it should.
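The size of the distortion can be checked with a few lines of Python. This is a minimal sketch: the two values are hypothetical and are not the GDP figures from the graphic, and `circle_area` is a helper name chosen for this example.

```python
import math


def circle_area(radius: float) -> float:
    """Return the area of a circle with the given radius."""
    return math.pi * radius ** 2


# Hypothetical values (NOT real GDP figures): one is 3 times the other.
big_value, small_value = 3.0, 1.0

# Radius-based scaling: each radius is set directly to the value.
area_ratio = circle_area(big_value) / circle_area(small_value)
value_ratio = big_value / small_value

print(f"value ratio: {value_ratio:.1f}")  # value ratio: 3.0
print(f"area ratio : {area_ratio:.1f}")   # area ratio : 9.0
```

The reader compares areas, so a value 3 times larger ends up looking 9 times larger.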


Here is a corrected version released on the Fast Fedora blog, where each value is mapped to the area of the circle instead:



[Image: corrected version, with the circle area scaled to GDP]



The United States still has the biggest economy, but the difference from other countries does not appear as big as it did on the first graphic.
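One way to get area-proportional circles is to take the square root of the value before computing the radius. The sketch below assumes that approach; `radius_for_value`, `max_radius` and the sample values are hypothetical names and numbers for illustration, not the method used to build the corrected graphic.

```python
import math


def radius_for_value(value: float, max_value: float, max_radius: float) -> float:
    """Return a radius such that the circle AREA is proportional to value.

    The largest value is drawn with radius max_radius.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    # Area is proportional to radius squared, so take the square root.
    return max_radius * math.sqrt(value / max_value)


# Hypothetical values (NOT real GDP figures).
values = [100.0, 50.0, 25.0]
radii = [radius_for_value(v, max(values), 40.0) for v in values]
areas = [math.pi * r ** 2 for r in radii]

for v, r, a in zip(values, radii, areas):
    print(f"value={v:6.1f}  radius={r:5.1f}  area={a:8.1f}")
```

With this mapping, the ratio between two areas equals the ratio between the two values, which is what the reader expects. Note that some plotting libraries already expect an area rather than a radius for marker size, so it is worth checking the documentation before applying the square root.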

Conclusion


When working with 2D objects, the value must be mapped to the area, not to the radius. Furthermore, note that area is a poor encoding of values because the human eye perceives it imprecisely. It should be used only when better visual channels are already taken on the graphic (as in a bubble plot). In this case, a barplot would probably have done a better job.


A work by Yan Holtz for data-to-viz.com