Creating Custom Maps of Australia in R: A Step-by-Step Guide
Introduction
Maps allow us to easily convey spatial information. In this tutorial, I’ll show you an easy way to create maps of Australia in R and how to add extra information to the maps using colours and labels.
Load required packages:
if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("runapp-aus/strayr")
pacman::p_load_gh("yutannihilation/ggsflabel")
pacman::p_load(tidyverse, scales)
Map of Australia
Figure 1 shows a map of Australia, with greater capital city areas labelled and coloured by population.
Let me show you how this map was created.
Accessing Map Data
Hard way
To create a map, we first need to get access to the underlying map data in ‘ESRI Shapefile format’.
The Australian Bureau of Statistics (ABS) updates its map geographies regularly, and the maps can be downloaded in Shapefile format here. For our purpose, we will be using the Greater Capital City Statistical Areas - 2021 - Shapefile.
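Once a Shapefile has been downloaded and unzipped, it can be read with the sf package. The sketch below is an assumption about that step rather than the author's code. Because the ABS download is not bundled here, it writes a made-up square to a temporary Shapefile and reads it back with sf::st_read(); toy_map, shp_path and the square itself are hypothetical stand-ins.

```r
library(sf)

# Made-up unit square standing in for a real ABS boundary (illustration only)
square <- sf::st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
toy_map <- sf::st_sf(
  region = "Toy region",
  geometry = sf::st_sfc(square, crs = 4326)
)

# Write the toy map to a temporary Shapefile so there is something to read
shp_path <- tempfile(fileext = ".shp")
sf::st_write(toy_map, shp_path, quiet = TRUE)

# Reading a Shapefile returns an sf data frame with a geometry column
toy_read <- sf::st_read(shp_path, quiet = TRUE)
```

In practice, shp_path would point at the .shp file extracted from the ABS download instead of a temporary file.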
Easy way
There is a user-generated package called absmapsdata, which makes it much easier to access ABS map data in a ready-to-use format.
You can download the entire absmapsdata package with:
remotes::install_github("wfmackey/absmapsdata")
Be warned, the absmapsdata package contains a lot of data!
To avoid a large download, you can use the strayr::read_absmap function from the aptly named strayr package to access specific maps from absmapsdata, without installing the whole absmapsdata package.
First, install the strayr package:
remotes::install_github("runapp-aus/strayr")
Then, use the strayr::read_absmap function to read the Greater Capital City Statistical Areas map object from absmapsdata:
gcc_map <- strayr::read_absmap("gcc2021")
The result is a dataframe with a special column called geometry, which contains the map parameters (a MULTIPOLYGON object) required to build the map:
# A tibble: 5 × 8
gcc_code_2021 gcc_name_2021 state_code_2021 state_name_2021 areasqkm_2021
<chr> <chr> <chr> <chr> <dbl>
1 1GSYD Greater Sydney 1 New South Wales 12369.
2 1RNSW Rest of NSW 1 New South Wales 788429.
3 19499 No usual address … 1 New South Wales NA
4 19799 Migratory - Offsh… 1 New South Wales NA
5 2GMEL Greater Melbourne 2 Victoria 9993.
# ℹ 3 more variables: cent_lat <dbl>, cent_long <dbl>,
# geometry <MULTIPOLYGON [°]>
Plotting the Map
We can plot the map with the ggplot2 package’s geom_sf() function:
gcc_map %>%
  ggplot2::ggplot() +
  ggplot2::geom_sf()
Now, let’s clean up the map and add some colour:
gcc_map %>%
  dplyr::filter(!is.na(areasqkm_2021)) %>%
  dplyr::filter(!state_name_2021 == "Other Territories") %>%
  ggplot2::ggplot() +
  ggplot2::geom_sf(fill = "brown", colour = "black") +
  ggplot2::theme_void()
Now let’s add some labels so we can identify the greater capital city areas.
To do this we will use the ggsflabel package:
gcc_map %>%
  dplyr::filter(!is.na(areasqkm_2021)) %>%
  dplyr::filter(!state_name_2021 == "Other Territories") %>%
  ggplot2::ggplot() +
  ggplot2::geom_sf(aes(geometry = geometry),
                   fill = "brown",
                   colour = "black") +
  ggsflabel::geom_sf_label_repel(aes(label = gcc_name_2021),
                                 size = 3) +
  ggplot2::theme_void()
Highlight map regions
What if we wanted to quickly identify the size of each greater capital city by area?
Rather than colouring the entire map, we can colour each greater capital city map region based on how big it is in area (km2).
To do this, we move ‘fill’ inside the geom_sf() aes() call and map it to the areasqkm_2021 variable:
gcc_map %>%
  dplyr::filter(!is.na(areasqkm_2021)) %>%
  dplyr::filter(!state_name_2021 == "Other Territories") %>%
  ggplot2::ggplot() +
  ggplot2::geom_sf(aes(geometry = geometry,
                       fill = areasqkm_2021),
                   colour = "black") +
  ggsflabel::geom_sf_label_repel(aes(label = gcc_name_2021),
                                 size = 3) +
  ggplot2::theme_void()
OK. Not bad.
Let’s tidy the map by fixing the legend with the scales package and adjusting the colour palette:
gcc_map %>%
  dplyr::filter(!is.na(areasqkm_2021)) %>%
  dplyr::filter(!state_name_2021 == "Other Territories") %>%
  ggplot2::ggplot() +
  ggplot2::geom_sf(aes(geometry = geometry,
                       fill = areasqkm_2021),
                   colour = "black") +
  ggsflabel::geom_sf_label_repel(aes(label = gcc_name_2021),
                                 size = 3) +
  ggplot2::scale_fill_gradient2(low = "red",
                                mid = "orange",
                                high = "steelblue",
                                na.value = NA,
                                name = "Area (Km2)",
                                labels = scales::comma) +
  ggplot2::theme_void()
This is looking pretty good and makes it easy to scan the map and see that “Rest of WA” is the largest region by area.
Mapping External Data
A major benefit of maps is that you can visualise variables across regions, such as population or income.
In R, this involves joining two data sets together based on a common field.
Let’s create the greater capital city area map again, this time colouring the regions by population (as shown in Figure 1).
First we need to download population data from the ABS here.
Then we import the population data as a dataframe and tidy it up into a usable format:
# A tibble: 6 × 2
gcc_name population_2022_gcc
<chr> <dbl>
1 Rest of NSW 2862995
2 Greater Sydney 5302736
3 Rest of Vic. 1590226
4 Greater Melbourne 5035738
5 Greater Brisbane 2625341
6 Rest of Qld 2695155
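The import code is not shown, and the exact steps depend on the layout of the downloaded ABS file. As a stand-in, this sketch types in the six rows printed above so that the later steps have a data_population_aus object to work with. It is an assumption for illustration, not the author's import code, and the real table would cover every region rather than these six.

```r
library(tibble)

# Stand-in for the imported ABS table: only the six rows shown in this tutorial
data_population_aus <- tibble::tribble(
  ~gcc_name,           ~population_2022_gcc,
  "Rest of NSW",       2862995,
  "Greater Sydney",    5302736,
  "Rest of Vic.",      1590226,
  "Greater Melbourne", 5035738,
  "Greater Brisbane",  2625341,
  "Rest of Qld",       2695155
)

data_population_aus
```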
Now we merge the population data with the greater capital city area map data from earlier:
#remove "2021" from all column names of gcc map object
#merge the datasets based on a common variable "gcc_name"
gcc_map_merge <-
  gcc_map %>%
  dplyr::rename_all(., ~stringr::str_replace(., "_2021", "")) %>%
  dplyr::left_join(., data_population_aus %>%
                     dplyr::select(gcc_name, population_2022_gcc) %>%
                     dplyr::distinct(), by = c('gcc_name')) %>%
  dplyr::relocate(population_2022_gcc, .after = gcc_name)
The result is a map object with a new variable storing the greater capital city area population:
# A tibble: 5 × 9
gcc_code gcc_name population_2022_gcc state_code state_name areasqkm cent_lat
<chr> <chr> <dbl> <chr> <chr> <dbl> <dbl>
1 1GSYD Greater … 5302736 1 New South… 12369. -33.7
2 1RNSW Rest of … 2862995 1 New South… 788429. -32.1
3 19499 No usual… NA 1 New South… NA NA
4 19799 Migrator… NA 1 New South… NA NA
5 2GMEL Greater … 5035738 2 Victoria 9993. -37.8
# ℹ 2 more variables: cent_long <dbl>, geometry <MULTIPOLYGON [°]>
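A left join keeps every map row and fills in NA wherever no matching name is found, so it can be worth checking which regions failed to match. Some NA values are expected, such as the “No usual address” rows above, but a misspelt region name would fail silently in the same way. A minimal sketch of that check, using small hypothetical stand-ins (toy_map_names, toy_population) for the two tables, with one name misspelt on purpose:

```r
library(dplyr)

# Hypothetical stand-ins for the map table and the population table
toy_map_names <- tibble::tibble(
  gcc_name = c("Greater Sydney", "Rest of NSW", "Greater Melbourne")
)
toy_population <- tibble::tibble(
  gcc_name = c("Greater Sydney", "Rest of NSW", "Greater Melbourn"),  # deliberate typo
  population_2022_gcc = c(5302736, 2862995, 5035738)
)

# Rows of the map table that have no partner in the population table
unmatched <- dplyr::anti_join(toy_map_names, toy_population, by = "gcc_name")
unmatched$gcc_name
```

Running the same anti_join on the renamed map object and data_population_aus would list the regions that end up with a missing population.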
Let’s plot the map:
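The plotting code for this step is not shown. A minimal sketch, assuming it follows the same pattern as the area map with fill mapped to population_2022_gcc. To keep it runnable on its own, toy_map is a hypothetical stand-in built from two made-up squares, with the two population figures taken from the table above. It also uses scale_fill_gradient() (a two-colour scale) rather than scale_fill_gradient2(), which is a substitution on my part.

```r
library(sf)
library(ggplot2)

# Helper that builds a made-up square; these are not real boundaries
make_square <- function(x0, y0) {
  sf::st_polygon(list(rbind(
    c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1), c(x0, y0 + 1), c(x0, y0)
  )))
}

# Toy map object with the same column names as the merged map
toy_map <- sf::st_sf(
  gcc_name = c("Greater Sydney", "Greater Melbourne"),
  population_2022_gcc = c(5302736, 5035738),
  geometry = sf::st_sfc(make_square(0, 0), make_square(2, 0), crs = 4326)
)

# Colour each region by population and format the legend with thousands separators
population_map <- ggplot2::ggplot(toy_map) +
  ggplot2::geom_sf(ggplot2::aes(fill = population_2022_gcc), colour = "black") +
  ggplot2::scale_fill_gradient(
    low = "orange",
    high = "steelblue",
    name = "Population (2022)",
    labels = scales::comma
  ) +
  ggplot2::theme_void()

population_map
```

Swapping toy_map for gcc_map_merge, after the same two filters used in the earlier plots, should give the real map; the ggsflabel label layer can be added as before.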
Looks great!
A keen viewer will now be able to quickly spot that Greater Sydney and Greater Melbourne are the most populated regions in Australia, even though they are some of the smallest in area.
Conclusion
Creating maps in R is a powerful way to visualise spatial data. With the ggplot2 package and the absmapsdata and strayr packages, it is easy to access, plot and customise map data of Australia.