Stacked Area Graph


Definition


A stacked area chart is an extension of the basic area chart. It displays the evolution of the values of several groups on the same graphic. The values of each group are displayed on top of each other, which makes it possible to follow, on a single figure, both the evolution of the total of a numeric variable and the contribution of each group.
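To make the stacking idea concrete, here is a minimal base R sketch using made-up numbers (the group names `A`, `B`, `C` and all values are hypothetical, not taken from any dataset on this page). It only illustrates the arithmetic of stacking and is not meant to describe how any plotting library computes it internally: each band starts where the previous one ends, so the top of the last band is the total.

```r
# Toy data (hypothetical values): rows are time points, columns are groups.
values <- matrix(
  c(10, 20, 30,
    15, 25, 20,
    20, 10, 40),
  nrow = 3, byrow = TRUE,
  dimnames = list(year = c("2001", "2002", "2003"),
                  group = c("A", "B", "C"))
)

# Upper edge of each band: running sum across groups within each time point.
upper <- t(apply(values, 1, cumsum))

# Lower edge of each band: the upper edge of the band below it (0 for the first).
lower <- upper - values

# The top of the last band is the total at each time point.
total <- upper[, ncol(upper)]
print(total)
```

The height of each band (`upper - lower`) is the group's own value, while `total` traces the outline of the whole stack.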


The following example shows the evolution of baby name frequencies in the US between 1880 and 2015.

# Libraries
library(tidyverse)
library(babynames)
library(viridis)
library(hrbrthemes)
library(plotly)

# Load the dataset from the babynames package and keep nine female names
data <- babynames %>% 
  filter(name %in% c("Ashley", "Amanda", "Jessica", "Patricia", "Linda", "Deborah", "Dorothy", "Betty", "Helen")) %>%
  filter(sex == "F")

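# Plot: one stacked area per name; `text` is only used for the plotly tooltip
p <- data %>% 
  ggplot(aes(x = year, y = n, fill = name, text = name)) +
    geom_area() +
    scale_fill_viridis(discrete = TRUE) +
    ggtitle("Popularity of American names since 1880") +
    theme_ipsum() +
    theme(legend.position = "none")   # set after theme_ipsum() so it is not overridden

# Convert to an interactive chart; hovering over an area shows the name
ggplotly(p, tooltip = "text")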

Note: This graphic does not have a legend since it is interactive. Hover over a group to get its name. The dataset is available through the babynames R library, and a .csv version is available on GitHub.

What for


The effectiveness of the stacked area graph is debated, and it must be used with care.

This website dedicates a whole page to stacking and its potential pitfalls; visit it to go further.

Variation


A variation of the stacked area graph is the percent stacked area graph. It is the same chart, but the values of each group are normalized at each time stamp. This makes it easier to study the share of each group in the whole:

p <- data %>% 
  # Compute the proportion of each name within each year:
  group_by(year) %>%
  mutate(freq = n / sum(n)) %>%
  ungroup() %>%
  
  # Plot the proportions instead of the raw counts
  ggplot(aes(x = year, y = freq, fill = name, color = name, text = name)) +
    geom_area() +
    scale_fill_viridis(discrete = TRUE) +
    scale_color_viridis(discrete = TRUE) +
    ggtitle("Relative popularity of American names since 1880") +
    theme_ipsum() +
    theme(legend.position = "none")   # set after theme_ipsum() so it is not overridden

# Convert to an interactive chart; hovering over an area shows the name
ggplotly(p, tooltip = "text")
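The normalization step above can be checked on a small example. The following base R sketch uses a hypothetical toy data frame (the names `A`, `B`, `C` and the counts are invented for illustration) with the same `year`, `name` and `n` columns, and confirms that the proportions add up to 1 within each year:

```r
# Toy long-format data (hypothetical values), one row per name and year.
toy <- data.frame(
  year = c(2001, 2001, 2001, 2002, 2002, 2002),
  name = c("A", "B", "C", "A", "B", "C"),
  n    = c(10, 30, 60, 50, 25, 25)
)

# Divide each value by the total of its year
# (a base R counterpart of group_by(year) + mutate(freq = n / sum(n))).
toy$freq <- toy$n / ave(toy$n, toy$year, FUN = sum)

# Sanity check: the proportions should sum to 1 within every year.
check <- tapply(toy$freq, toy$year, sum)
print(check)
```

Because every year sums to 1, the top of the stack is a flat line and only the relative shares remain visible; information about the evolution of the total is lost.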

Common caveats


Related


Build your own


The R and Python graph galleries are two websites providing hundreds of chart examples, each with reproducible code. Click the button below to see how to build the chart you need with your favorite programming language.

R graph gallery Python gallery


A work by Yan Holtz for data-to-viz.com