Grouped barplot must be grouped

A collection of common dataviz caveats by Data-to-Viz.com




The issue


A barplot displays the value of a numeric variable for several entities. These entities can be grouped using a categorical variable, resulting in a grouped barplot.


When building a grouped barplot, make sure that your bars are indeed grouped: bars within a group must be closer to one another than bars in different groups. In the example below, the bars are evenly spaced rather than grouped, which makes it hard to tell which bars belong to the same year.

# Libraries
library(tidyverse)
library(hrbrthemes)
library(babynames)
library(viridis)

# Load the dataset (shipped with the babynames package) and keep two female names
data <- babynames %>% 
  filter(name %in% c("Anna", "Mary")) %>%
  filter(sex=="F")

# Bars are NOT grouped here: each year/name pair gets its own evenly spaced x position
data  %>% 
  filter(year %in% c(1950, 1960, 1970, 1980, 1990, 2000)) %>%
  mutate(year=as.factor(year)) %>%
  mutate( nameYear = paste(year, name, sep=" - ")) %>%
  ggplot( aes(x=as.factor(nameYear), y=n, fill=name)) +
    geom_bar(stat="identity") +
    scale_fill_viridis(discrete=TRUE, name="") +
    theme_ipsum() +
    ylab("Number of babies") +
    xlab("") +
    theme(
      axis.text.x = element_text(angle=60, hjust=1)
    )

Solving the issue


The fix is simple: group your bars. Map the year to the x axis and use position="dodge" so that the bars for each name sit side by side within a year:

# Libraries
library(tidyverse)
library(hrbrthemes)
library(babynames)
library(viridis)

# Load the dataset (shipped with the babynames package) and keep two female names
data <- babynames %>% 
  filter(name %in% c("Anna", "Mary")) %>%
  filter(sex=="F")

# A grouped barplot: position="dodge" places the bars of a given year side by side
data  %>% 
  filter(year %in% c(1950, 1960, 1970, 1980, 1990, 2000)) %>%
  mutate(year=as.factor(year)) %>%
  ggplot( aes(x=year, y=n, fill=name)) +
    geom_bar(stat="identity", position="dodge") +
    scale_fill_viridis(discrete=TRUE, name="") +
    theme_ipsum() +
    ylab("Number of babies")
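
If the default spacing does not separate the groups clearly enough, the gaps can be tuned by hand. The snippet below is a minimal sketch rather than part of the original example: it assumes ggplot2's geom_col() (equivalent to geom_bar(stat="identity")) together with position_dodge(), and the two width values (0.7 and 0.8) are arbitrary choices for illustration. The object names plot_data and p are also just placeholders.

```r
# Libraries
library(dplyr)
library(ggplot2)
library(babynames)

# Same subset as above: two female names, one year per decade
plot_data <- babynames %>%
  filter(name %in% c("Anna", "Mary"), sex == "F") %>%
  filter(year %in% c(1950, 1960, 1970, 1980, 1990, 2000)) %>%
  mutate(year = as.factor(year))

# width sets the total bar width per year (shared by the dodged bars);
# position_dodge(width) sets how far apart the bar centers sit within a year.
# Values are illustrative assumptions, not recommendations.
p <- ggplot(plot_data, aes(x = year, y = n, fill = name)) +
  geom_col(width = 0.7, position = position_dodge(width = 0.8)) +
  ylab("Number of babies")

p
```

With these values the two bars of a year are separated by a thin gap, while the space between two consecutive years stays noticeably larger, so the grouping remains readable.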


A work by Yan Holtz for data-to-viz.com