Mental arithmetic in dataviz

A collection of common dataviz caveats by Data-to-Viz.com

Example


Let’s consider the number of people entering (red curve) and leaving (blue curve) a shop from 8am to 8pm, recorded every half hour. A line plot represents this data accurately and makes it easy to read how many people enter or leave the shop at any given time.


# Libraries
library(tidyverse)
library(hrbrthemes)

# Create data
data <- data.frame(
  x = seq(8,20,0.5),
  Entering = c(20,22,19,24,28,29,26,32,34,37,33,34,30,28,29,30,27,21,19,21,17,13,15,12,9),
  Leaving = c(0,4,8,7,10,13,15,16,15,16,17,19,22,21,24,26,24,25,28,29,28,26,23,20,19)
)

# Reshape to long format (one row per time slot and type), then plot
data %>%
  pivot_longer( -x, names_to="type", values_to="value") %>%
  ggplot( aes(x=x, y=value, color=type)) +
    geom_line() +
    ylim(0,40) +
    scale_color_discrete(name="") +
    scale_x_continuous(breaks=seq(8,20,1)) +
    annotate( "text", x=c(12.5, 16.3, 17.5), y=c(39, 27, 31), label=LETTERS[1:3] ) +
    theme_ipsum() +
    theme(
      panel.grid.minor = element_blank(),
      legend.position = c(0.9, 0.9)  # top right, inside the panel; ggplot2 >= 3.5.0 prefers legend.position.inside
    ) +
    ylab("# of people") + 
    xlab("Hour of day")
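
The labels A, B and C on the plot seem to mark the peak of entries, the point where the two curves cross, and the peak of exits. That is only a reading of the annotation coordinates, not something the code states. As a rough sketch, the same three moments can be looked up in the data with base R. The variable names below are made up for this illustration, and the vectors are copied from the data frame above so the snippet runs on its own.

```r
# Same values as in the data frame above
x <- seq(8, 20, 0.5)
entering <- c(20,22,19,24,28,29,26,32,34,37,33,34,30,28,29,30,27,21,19,21,17,13,15,12,9)
leaving  <- c(0,4,8,7,10,13,15,16,15,16,17,19,22,21,24,26,24,25,28,29,28,26,23,20,19)

# Hour at which each curve reaches its maximum
peak_entering <- x[which.max(entering)]
peak_leaving  <- x[which.max(leaving)]

# First half-hour slot where more people leave than enter
first_net_outflow <- x[which(leaving > entering)[1]]

print(c(A = peak_entering, B = first_net_outflow, C = peak_leaving))
#    A    B    C 
# 12.5 16.5 17.5
```

The curves cross somewhere between 16 and 16.5, which fits the label B placed at x = 16.3. Each of these values can be read straight off the line plot, which is what this chart is good at.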