A heatmap is a graphical representation of data where the individual values contained in a matrix are represented as colors. It is a bit like looking at a data table from above.
Here is an example showing 8 general features like population or life expectancy for about 30 countries in 2015. Data come from the French National Institute of Demographic Studies.
# Libraries
library(tidyverse)
library(hrbrthemes)
library(viridis)
library(plotly)
# d3heatmap is not on CRAN yet, but can be found here: https://github.com/talgalili/d3heatmap
library(d3heatmap)
# Load data
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/multivariate.csv", header = TRUE, sep = ";")
colnames(data) <- gsub("\\.", " ", colnames(data))
# Select a few countries
data <- data %>%
  filter(Country %in% c("France", "Sweden", "Italy", "Spain", "England", "Portugal", "Greece", "Peru", "Chile", "Brazil", "Argentina", "Bolivia", "Venezuela", "Australia", "New Zealand", "Fiji", "China", "India", "Thailand", "Afghanistan", "Bangladesh", "United States of America", "Canada", "Burundi", "Angola", "Kenya", "Togo")) %>%
  arrange(Country) %>%
  mutate(Country = factor(Country, Country))
# Matrix format
mat <- data
rownames(mat) <- mat[,1]
mat <- mat %>% dplyr::select(-Country, -Group, -Continent)
mat <- as.matrix(mat)
# Heatmap
#d3heatmap(mat, scale="column", dendrogram = "none", width="800px", height="800px", colors = "Blues")
library(heatmaply)
p <- heatmaply(mat,
  dendrogram = "none",
  xlab = "", ylab = "",
  main = "",
  scale = "column",
  margins = c(60,100,40,20),
  grid_color = "white",
  grid_width = 0.00001,
  titleX = FALSE,
  hide_colorbar = TRUE,
  branches_lwd = 0.1,
  label_names = c("Country", "Feature:", "Value"),
  fontsize_row = 5, fontsize_col = 5,
  labCol = colnames(mat),
  labRow = rownames(mat),
  heatmap_layers = theme(axis.line = element_blank())
)
Note: You can learn more about this dataset and how to visualize it on the dedicated page.
A heatmap is really useful to display a general view of numerical data, not to extract specific data points. In the graphic above, for example, the huge population sizes of China and India pop out.
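The heatmap above is drawn with scale = "column", so each feature is colored relative to its own range rather than on one shared scale. Here is a minimal sketch of that idea, assuming column scaling behaves much like base R's scale() (centering each column and dividing by its standard deviation). The toy matrix and its values are made up for illustration and are not the INED data.

```r
# Toy matrix (hypothetical values): population in millions, life expectancy in years
toy <- matrix(
  c(1400, 76,
    1300, 68,
      65, 82,
      10, 82),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D"), c("Population", "Life expectancy"))
)

# Center each column on 0 and divide it by its standard deviation
toy_scaled <- scale(toy)

# Both features now live on a comparable scale
round(toy_scaled, 2)
```

Without this step, a single color scale would be dominated by the population column and the other features would all look identical.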
A heatmap is also useful to display the result of hierarchical clustering. Basically, clustering checks which countries tend to have similar values on their numeric variables, and therefore which countries are similar. The usual way to represent the result is to use dendrograms.
This type of chart can be drawn around the heatmap:
p <- heatmaply(mat,
  #dendrogram = "row",
  xlab = "", ylab = "",
  main = "",
  scale = "column",
  margins = c(60,100,40,20),
  grid_color = "white",
  grid_width = 0.00001,
  titleX = FALSE,
  hide_colorbar = TRUE,
  branches_lwd = 0.1,
  label_names = c("Country", "Feature:", "Value"),
  fontsize_row = 5, fontsize_col = 5,
  labCol = colnames(mat),
  labRow = rownames(mat),
  heatmap_layers = theme(axis.line=element_blank())
)
# save the widget
# library(htmlwidgets)
# saveWidget(p, file= "~/Desktop/R-graph-gallery/HtmlWidget/heatmapInter.html")
Here, Burundi and Angola are grouped together. Indeed, both countries have fast-growing populations, with many children per woman but still a high mortality rate.
Note: in this heatmap, features are also clustered. For instance, birth rate and children per woman are grouped together since they are highly correlated.
Note: hierarchical clustering is a complex statistical method. You can learn more about it here.
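To give a rough idea of what happens behind the dendrogram, here is a minimal sketch using base R's dist() and hclust() on a toy matrix. The country labels and values are hypothetical, and the choice of Euclidean distance with complete linkage is an assumption made for the example, not necessarily what the heatmap above uses.

```r
# Toy data (hypothetical values): 5 countries, 2 features
toy <- matrix(
  c(6.0, 11.0,
    5.8, 13.0,
    1.9,  9.0,
    1.8,  9.5,
    2.4,  7.0),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D", "E"),
                  c("Children per woman", "Death rate"))
)

# 1. Put the features on a common scale
# 2. Compute the Euclidean distance between each pair of countries
# 3. Merge the closest groups step by step
hc <- hclust(dist(scale(toy)), method = "complete")

# Draw the dendrogram
plot(hc)

# Cut the tree into 2 groups of countries
groups <- cutree(hc, k = 2)
groups
```

In this toy example, countries A and B end up in one group and C, D and E in the other, because A and B share a high number of children per woman.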
We’ve seen in the previous section that a heatmap is often used to display the result of a clustering algorithm. A common task is to compare the result with expectations. For instance, we can check whether the countries cluster according to their continent using a color bar.
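Here is a minimal sketch of such a color bar. To keep it self-contained, it uses base R's heatmap() and its RowSideColors argument rather than heatmaply, and the data, country names and continent labels are made up for illustration.

```r
# Toy data (hypothetical): 6 countries, 3 features, 2 continents
set.seed(1)
toy <- matrix(rnorm(18), nrow = 6,
              dimnames = list(paste("Country", 1:6),
                              c("Feature 1", "Feature 2", "Feature 3")))
continent <- c("Europe", "Europe", "Europe", "Africa", "Africa", "Africa")

# One color per continent, then one color per row of the matrix
palette_continent <- c(Europe = "steelblue", Africa = "darkorange")
row_colors <- unname(palette_continent[continent])

# heatmap() reorders the rows by clustering; each row keeps its own color
hm <- heatmap(toy, scale = "column", RowSideColors = row_colors)
```

If rows of the same color sit next to each other once reordered, the clustering agrees with the continents. With random data like this, there is no reason to expect that.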
For a static heatmap, a common practice is to display the exact value of each cell in numbers. Indeed, it is hard to translate a color into a precise number.
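One possible way to do this with base R graphics is to draw the cells with image() and write the numbers on top with text(). This is only a sketch on a hypothetical toy matrix, and the color ramp is an arbitrary choice.

```r
# Toy matrix (hypothetical values)
toy <- matrix(c(1.2, 3.4, 2.1,
                0.5, 2.8, 4.0),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Row A", "Row B"), c("X", "Y", "Z")))
nr <- nrow(toy)
nc <- ncol(toy)

# image() puts the rows of its input along the x axis, so we transpose
# and reverse the row order to draw the matrix the way it prints
image(x = 1:nc, y = 1:nr, z = t(toy[nr:1, , drop = FALSE]),
      axes = FALSE, xlab = "", ylab = "",
      col = colorRampPalette(c("white", "steelblue"))(12))
axis(1, at = 1:nc, labels = colnames(toy), tick = FALSE)
axis(2, at = 1:nr, labels = rev(rownames(toy)), tick = FALSE, las = 1)

# Write each value in the middle of its cell
cells <- expand.grid(col = 1:nc, row = 1:nr)
cells$value <- toy[cbind(cells$row, cells$col)]
text(x = cells$col, y = nr + 1 - cells$row, labels = cells$value)
```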
Heatmaps can also be used for time series where there is a regular pattern in time.
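As a small illustration, the sketch below reshapes the AirPassengers series that ships with R (monthly counts from 1949 to 1960) into a year by month matrix, so the seasonal pattern shows up as columns of similar color. Using base R's heatmap() here is a choice made to keep the example self-contained.

```r
# Monthly series -> matrix with one row per year and one column per month
m <- matrix(as.numeric(AirPassengers), ncol = 12, byrow = TRUE,
            dimnames = list(1949:1960, month.abb))

# No clustering here: the order of years and months carries the meaning
heatmap(m, Rowv = NA, Colv = NA, scale = "none",
        xlab = "Month", ylab = "Year")
```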
Heatmaps can be applied to adjacency matrices.
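For example, a network described as a list of edges can be turned into a square matrix with one row and one column per node, then drawn as a heatmap. The edge list below is hypothetical, and the network is assumed to be undirected.

```r
# Hypothetical edge list: who is connected to whom
edges <- data.frame(
  from = c("A", "A", "B", "C", "D"),
  to   = c("B", "C", "C", "D", "E"),
  stringsAsFactors = FALSE
)
nodes <- sort(unique(c(edges$from, edges$to)))

# Square matrix: 1 if two nodes are connected, 0 otherwise
adj <- matrix(0, nrow = length(nodes), ncol = length(nodes),
              dimnames = list(nodes, nodes))
adj[cbind(edges$from, edges$to)] <- 1
adj[cbind(edges$to, edges$from)] <- 1  # undirected: mirror each edge

# Two colors are enough: connected or not
heatmap(adj, Rowv = NA, Colv = NA, scale = "none", symm = TRUE,
        col = c("white", "steelblue"))
```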
A work by Yan Holtz for data-to-viz.com