The biggest UK cities

A few data analytics ideas from

This document gives a few suggestions for analysing a dataset composed of a set of geographic coordinates, each with an associated numeric value.

It considers the population of 925 cities in the UK. This example dataset is provided in the R maps library and is available on this GitHub repository. Its first rows are shown in the table below.
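
Since the dataset comes from the R maps library, it can also be built locally. The snippet below is a minimal sketch, assuming the table in question is `world.cities` from the `maps` package, filtered on `country.etc == "UK"`. The object name `uk_cities` is only illustrative, and the row count may differ from 925 depending on the package version.

```r
# Assumption: the `maps` package is installed and `world.cities` is the
# table the text refers to.
library(maps)
data(world.cities)

# Keep UK rows and the four columns used in this document
uk_cities <- world.cities[world.cities$country.etc == "UK",
                          c("lat", "long", "pop", "name")]

# Show the first rows
head(uk_cities, 5)
```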

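# Libraries
library(tidyverse)   # dplyr pipes and ggplot2
library(knitr)       # kable()
library(kableExtra)  # kable_styling()
library(viridis)     # scale_color_viridis()
library(plotly)      # ggplotly()
library(maps)        # world polygons used by map_data()
library(mapproj)     # projections used by coord_map()
options(knitr.table.format = "html")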

# Load dataset from github
data <- read.table("", header=T)

# show data
data %>% head(5) %>% kable() %>%
  kable_styling(bootstrap_options = "striped", full_width = F)
lat    long   pop     name
51.65  -3.14  10146   Abercarn-Newbridge
51.72  -3.46  33048   Aberdare
57.15  -2.10  184031  Aberdeen
51.83  -3.02  14251   Abergavenny
53.28  -3.58  17819   Abergele

Bubble map

The go-to graphic in this kind of situation is probably the bubble map: one circle is drawn at each geographic position, and its size encodes the associated numeric value. Be careful: the bubble's area, not its radius, must be proportional to the value. Otherwise large values look disproportionately big.
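
To make the area rule concrete, here is a minimal sketch in base R. The helper `radius_for_area` and the sample values are hypothetical and not taken from the dataset. It computes the radius that gives each circle an area proportional to its value.

```r
# Hypothetical helper: radius of a circle whose area equals k * value
radius_for_area <- function(value, k = 1) {
  sqrt(k * value / pi)
}

# Illustrative values only (each one is 4x the previous)
pop <- c(10000, 40000, 160000)
r <- radius_for_area(pop)

# Radius relative to the smallest bubble: it doubles when the value quadruples
r / r[1]

# Area relative to the smallest bubble: it follows the value ratio
area <- pi * r^2
area / area[1]
```

In ggplot2, `scale_size_continuous()` maps values to point area, whereas `scale_radius()` maps them to the radius. `scale_size_area()` additionally guarantees that a value of 0 is drawn with area 0.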

# Get the world polygon and extract UK
UK <- map_data("world") %>% filter(region=="UK")

# Build the static plot first; it is easy to make it interactive afterwards
p <- data %>%
  arrange(desc(pop)) %>%
  mutate(name = factor(name, unique(name))) %>%
  mutate(mytext = paste("City: ", name, "\n", "Population: ", pop, sep = "")) %>%  # This prepares the text displayed on hover.
  # Make the static plot, calling this text:
  ggplot() +
    ggplot2::annotate("text", x = 1, y = 56.3, label="Biggest cities in the UK", colour = "black", size=4, alpha=1) +
    ggplot2::annotate("segment", x = -1, xend = 3, y = 56, yend = 56, colour = "black", size=0.2, alpha=1) +
    geom_polygon(data = UK, aes(x=long, y = lat, group = group), fill="grey", alpha=0.3) +
    geom_point(aes(x=long, y=lat, size=pop, color=pop, text=mytext, alpha=pop) ) +
    scale_size_continuous(range=c(0.3,12)) +
    scale_color_viridis(option="inferno", trans="log" ) +
    scale_alpha_continuous(trans="log") +
    theme_void() +
    ylim(50,59) +
    coord_map() +
    theme(legend.position = "none")

# Convert to an interactive plotly widget; the tooltip shows the "text" aesthetic
p <- ggplotly(p, tooltip="text")
p

Note: this map is interactive. Hover over a bubble to see the city name and its population, and zoom in on a specific area for more detail.

Going further

You can learn more about each type of graphic presented in this story in the dedicated sections.

A work by Yan Holtz for