Streamgraph


Definition


A streamgraph is a type of stacked area chart. It displays the evolution of a numeric value (Y axis) following another numeric value (X axis). This evolution is represented for several groups, each with a distinct color.

Unlike a stacked area chart, a streamgraph has no corners: edges are rounded, which gives a pleasing impression of flow. Moreover, areas are usually displaced around a central axis, resulting in a flowing and organic shape.
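To make the idea of displacing areas around a central axis concrete, here is a minimal sketch in base R. It uses a small made-up matrix (not the baby names data) and the simplest centered layout, often called a silhouette offset, where each stack is shifted down by half of its total height. The helper `stack_bands()` is a hypothetical name introduced for illustration, and the edge smoothing that real streamgraphs apply is left out.

```r
# Toy data, made up for illustration: rows are time points, columns are groups
values <- matrix(
  c(1, 2, 3,
    2, 3, 1,
    4, 1, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("t1", "t2", "t3"), c("A", "B", "C"))
)

# Centered layout: shift each stack down by half of its total height
baseline_centered <- -rowSums(values) / 2

# Lower and upper edge of each band = baseline + cumulative sum across groups
# (stack_bands is a hypothetical helper, not part of any package)
stack_bands <- function(values, baseline) {
  upper <- baseline + t(apply(values, 1, cumsum))
  lower <- cbind(baseline, upper[, -ncol(upper), drop = FALSE])
  colnames(lower) <- colnames(values)
  list(lower = lower, upper = upper)
}

centered <- stack_bands(values, baseline_centered)
centered$lower
centered$upper
```

With a baseline of 0 instead of `baseline_centered`, the same function gives the bands of a regular stacked area chart.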


The following example shows the evolution of baby name frequencies in the US between 1880 and 2015.

# Libraries
library(tidyverse)
library(babynames)
library(streamgraph)


# Load the dataset from the babynames package and keep a few female names
data <- babynames %>%
  filter(name %in% c("Ashley", "Amanda", "Jessica", "Patricia", "Linda", "Deborah", "Dorothy", "Betty", "Helen")) %>%
  filter(sex == "F")

# Plot
data %>%
  streamgraph(key="name", value="n", date="year") %>%
  sg_fill_brewer("BuPu")

Note: The dataset is available through the babynames R package, and a .csv version is available on GitHub.

What for


Streamgraphs are good for studying the relative proportions of the whole. However, they are poor at showing the evolution of each individual group: it is very hard to subtract the height of the other groups at each time point. For a more accurate but less attractive figure, consider a line chart or an area chart using small multiples.

Streamgraphs become really useful when displayed in an interactive mode: highlighting a group gives direct insight into its evolution.
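The small multiples alternative can be sketched with ggplot2, which is part of the tidyverse. This is only one possible version: it assumes the same nine names as the example above, and the fill color and axis labels are arbitrary choices.

```r
# Libraries
library(dplyr)
library(ggplot2)
library(babynames)

# Same subset as in the streamgraph example
data <- babynames %>%
  filter(
    sex == "F",
    name %in% c("Ashley", "Amanda", "Jessica", "Patricia", "Linda",
                "Deborah", "Dorothy", "Betty", "Helen")
  )

# One small area chart per name, all sharing the same axes
p <- ggplot(data, aes(x = year, y = n)) +
  geom_area(fill = "#8856a7") +
  facet_wrap(~name) +
  labs(x = "Year", y = "Number of babies")

p
```

Because each name starts from its own zero baseline, the height of every group can be read directly, at the cost of losing the view of the whole.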
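As a rough sketch of this interactive mode, the streamgraph package can add a drop-down selector that highlights one group. The snippet assumes the package's `sg_legend()` helper with its `show` and `label` arguments; the label text is an arbitrary choice.

```r
# Libraries
library(dplyr)
library(babynames)
library(streamgraph)

# Same subset as in the streamgraph example
data <- babynames %>%
  filter(
    sex == "F",
    name %in% c("Ashley", "Amanda", "Jessica", "Patricia", "Linda",
                "Deborah", "Dorothy", "Betty", "Helen")
  )

# Streamgraph with a drop-down menu to pick the name to highlight
p <- data %>%
  streamgraph(key = "name", value = "n", date = "year") %>%
  sg_fill_brewer("BuPu") %>%
  sg_legend(show = TRUE, label = "Name: ")

p
```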

Variation


Although areas are usually displaced around a central axis, they can also be stacked on top of the zero baseline, as in most other chart types.

# Plot
data %>%
  streamgraph(key="name", value="n", date="year", offset="zero") %>%
  sg_fill_brewer("BuPu")



It is also possible to create a percent streamgraph, where the proportion of each group is displayed instead of its absolute value. Here, the total count across names can no longer be read. However, the most popular names in each period become more obvious.

# Plot
data %>%
  streamgraph(key="name", value="n", date="year", offset="expand") %>%
  sg_fill_brewer("BuPu")
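To see what the percent version displays, the share of each name within a year can be computed by hand with dplyr. This is a sketch of the idea rather than the package's internal code. The column name `share` is introduced here for illustration; it differs from the `prop` column already present in babynames, which is the proportion among all births of that sex, not among the selected names.

```r
# Libraries
library(dplyr)
library(babynames)

# Share of each selected name among the selected names, year by year
shares <- babynames %>%
  filter(
    sex == "F",
    name %in% c("Ashley", "Amanda", "Jessica", "Patricia", "Linda",
                "Deborah", "Dorothy", "Betty", "Helen")
  ) %>%
  group_by(year) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

# Sanity check: shares add up to 1 within every year
totals <- shares %>%
  group_by(year) %>%
  summarise(total = sum(share))

totals
```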

Common caveats


  • In my opinion, streamgraphs work well when there is a clear pattern in the data. If the proportion of each group remains more or less the same over the whole time frame, the figure won’t be very insightful, since small variations are hard to read.

Build your own


The R, Python, React and D3 graph galleries are four websites providing hundreds of chart examples, each with reproducible code. Click a button below to see how to build the chart you need with your favorite programming language.

R graph gallery Python gallery React gallery D3 gallery

Dataviz decision tree

Data To Viz is a comprehensive classification of chart types organized by data input format.


A work by Yan Holtz for data-to-viz.com