Why you should order your data

A collection of common dataviz caveats by Data-to-Viz.com




By default, most data visualization tools order the groups of a categorical variable alphabetically, or in the order in which they appear in the input table. It is good practice to think about this order, since changing it can make a graphic much more insightful.
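To make these defaults concrete, here is a minimal base R sketch. The `sales` data frame and its values are hypothetical and made up for illustration; they are not the dataset used in this post. It shows the alphabetical default of `factor()`, the order of appearance, and an order driven by the numeric value.

```r
# Hypothetical toy data (assumption: NOT the dataset used in this post)
sales <- data.frame(
  country = c("Norway", "Brazil", "Italy"),
  value   = c(12, 30, 7)
)

# Default: factor() sorts the levels alphabetically
default_levels <- levels(factor(sales$country))

# Order of appearance in the input table
appearance_levels <- levels(factor(sales$country, levels = unique(sales$country)))

# Order by the numeric variable (increasing), using base R reorder()
value_levels <- levels(reorder(factor(sales$country), sales$value))

print(default_levels)
print(appearance_levels)
print(value_levels)
```

Plotting tools such as ggplot2 draw groups in the order of the factor levels, so changing the levels is usually all it takes to change the order on the chart.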

Unordered lollipop plot


Let’s start with a lollipop plot showing the quantity of weapons sold by a few countries. Each row represents a country, and the X-axis shows how many weapons it sold in 2017. Countries appear in alphabetical order, which is the default.

# Libraries
library(tidyverse)
library(hrbrthemes)
library(kableExtra)
options(knitr.table.format = "html")

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/7_OneCatOneNum.csv", header=TRUE, sep=",")

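# Plot: one lollipop per country, in the default (alphabetical) order
data %>%
  filter(!is.na(Value)) %>%   # drop rows with a missing Value
  ggplot( aes(x=Country, y=Value) ) +
    # Stem of the lollipop: a segment from 0 to the value
    geom_segment( aes(x=Country, xend=Country, y=0, yend=Value), color="grey") +
    # Head of the lollipop
    geom_point(size=3, color="#69b3a2") +
    # Flip the axes so that each country is a row
    coord_flip() +
    theme_ipsum() +
    theme(
      panel.grid.minor.y = element_blank(),
      panel.grid.major.y = element_blank(),
      legend.position="none"
    ) +
    xlab("")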