Creating Custom Maps of Australia in R: A Step-by-Step Guide

Introduction

Maps allow us to easily convey spatial information. In this tutorial, I’ll show you an easy way to create maps of Australia in R and how to add information to them using colours and labels.

Load required packages:

if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("runapp-aus/strayr")
pacman::p_load_gh("yutannihilation/ggsflabel")
pacman::p_load(tidyverse, scales)

Map of Australia

Figure 1 shows a map of Australia, with greater capital city areas labelled and coloured by population.

Figure 1: Australian Map showing Greater City Areas coloured by Population

Let me show you how this map was created.

Accessing Map Data

Hard way

To create a map, we first need to get access to the underlying map data in ‘ESRI Shapefile format’.
The Australian Bureau of Statistics (ABS) updates its map geographies regularly, and the maps can be downloaded in Shapefile format from the ABS website. For our purpose, we will be using the Greater Capital City Statistical Areas - 2021 - Shapefile.
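
If you do go the manual route, a shapefile can be read with the sf package. The sketch below is self-contained: it writes a made-up square to a temporary shapefile and reads it back, standing in for the ABS download. The toy polygon, the region column and the temporary path are all invented for illustration; with real data you would pass the path of the unzipped .shp file to sf::st_read().

```r
library(sf)

# Toy stand-in for a downloaded shapefile: one square polygon (not ABS data).
ring <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))
toy <- sf::st_sf(
  region = "Toy region",
  geometry = sf::st_sfc(sf::st_polygon(list(ring)), crs = 4326)
)

# Write it in ESRI Shapefile format to a fresh temporary path.
shp_path <- tempfile(fileext = ".shp")
sf::st_write(toy, shp_path, quiet = TRUE)

# Reading a real, unzipped shapefile works the same way.
toy_map <- sf::st_read(shp_path, quiet = TRUE)
class(toy_map)
```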

Easy way

There is a user-generated package called absmapsdata which makes it much easier to access ABS map data in a ready-to-use format.

You can download the entire absmapsdata package with:

remotes::install_github("wfmackey/absmapsdata")

Be warned, the absmapsdata package contains a lot of data!

To avoid a large download, you can use the strayr::read_absmap function from the aptly named strayr package to access specific maps from absmapsdata without installing the whole absmapsdata package.

First, install the strayr package:

remotes::install_github("runapp-aus/strayr")

Then, use the strayr::read_absmap function to read the Greater Capital City Statistical Areas map object from absmapsdata:

gcc_map <- 
  strayr::read_absmap("gcc2021")

The result is a data frame with a special column called geometry, which holds the MULTIPOLYGON objects needed to draw each region. The first five rows look like this:

# A tibble: 5 × 8
  gcc_code_2021 gcc_name_2021      state_code_2021 state_name_2021 areasqkm_2021
  <chr>         <chr>              <chr>           <chr>                   <dbl>
1 1GSYD         Greater Sydney     1               New South Wales        12369.
2 1RNSW         Rest of NSW        1               New South Wales       788429.
3 19499         No usual address … 1               New South Wales           NA 
4 19799         Migratory - Offsh… 1               New South Wales           NA 
5 2GMEL         Greater Melbourne  2               Victoria                9993.
# ℹ 3 more variables: cent_lat <dbl>, cent_long <dbl>,
#   geometry <MULTIPOLYGON [°]>
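
To get a feel for what the geometry column holds, here is a small sketch that builds an object of the same kind by hand: an sf data frame with a single MULTIPOLYGON row. It does not use the ABS data; the names toy_map and region, and the choice of CRS, are assumptions made for illustration.

```r
library(sf)

# One square ring, wrapped as a MULTIPOLYGON like the regions above.
ring <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))
geom <- sf::st_sfc(sf::st_multipolygon(list(list(ring))), crs = 4326)
toy_map <- sf::st_sf(region = "Toy region", geometry = geom)

# The geometry column is a list-column of class "sfc".
class(sf::st_geometry(toy_map))

# Geometry type of each row.
geom_type <- as.character(sf::st_geometry_type(toy_map))
geom_type
```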

Plotting the Map

We can plot the map with the ggplot2 package’s geom_sf() function:

gcc_map %>% 
  ggplot2::ggplot() +
  ggplot2::geom_sf()

Now, let’s clean up the map and add some colour:

gcc_map %>% 
  dplyr::filter(!is.na(areasqkm_2021)) %>% 
  dplyr::filter(state_name_2021 != "Other Territories") %>% 
  ggplot2::ggplot() +
  ggplot2::geom_sf(fill = "brown", colour = "black") +
  ggplot2::theme_void()
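
The two filters do different jobs: the first drops rows that have no area recorded, and the second drops the Other Territories. A small sketch on a made-up table shows the effect of each. The row names and the placeholder area in the last row are invented; only the Greater Sydney figure is taken (rounded) from the output above.

```r
library(dplyr)

toy <- tibble::tibble(
  gcc_name_2021   = c("Greater Sydney", "Row without area", "Hypothetical island"),
  state_name_2021 = c("New South Wales", "New South Wales", "Other Territories"),
  areasqkm_2021   = c(12369, NA, 1)  # last value is a placeholder, not a real area
)

kept <- toy %>%
  dplyr::filter(!is.na(areasqkm_2021)) %>%               # drops rows with no area
  dplyr::filter(state_name_2021 != "Other Territories")  # drops Other Territories

kept$gcc_name_2021
```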

Now let’s add some labels so we can identify the greater capital city areas.

To do this we will use the ggsflabel package:

gcc_map %>% 
  dplyr::filter(!is.na(areasqkm_2021)) %>% 
  dplyr::filter(state_name_2021 != "Other Territories") %>% 
  ggplot2::ggplot() +
  ggplot2::geom_sf(aes(geometry = geometry), 
                   fill = "brown", 
                   colour = "black") +
  ggsflabel::geom_sf_label_repel(aes(label = gcc_name_2021),
                                 size = 3) +
  ggplot2::theme_void()

Highlight map regions

What if we wanted to quickly identify the size of each greater capital city by area?

Rather than giving the entire map a single colour, we can colour each region according to its area (km²).

To do this, we move ‘fill’ inside aes() in geom_sf() and map it to the areasqkm_2021 variable:

gcc_map %>% 
  dplyr::filter(!is.na(areasqkm_2021)) %>% 
  dplyr::filter(state_name_2021 != "Other Territories") %>% 
  ggplot2::ggplot() +
  ggplot2::geom_sf(aes(geometry = geometry, 
                       fill = areasqkm_2021), 
                   colour = "black") +
  ggsflabel::geom_sf_label_repel(aes(label = gcc_name_2021),
                                 size = 3) +
  ggplot2::theme_void()
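
The difference between a fixed colour and a data-driven colour comes down to where fill is written: outside aes() it is a constant, inside aes() it is mapped to a column. A minimal sketch on two made-up squares illustrates this (toy_map, region and area are illustrative names, not ABS data):

```r
library(sf)
library(ggplot2)

# Unit square with its lower-left corner at (x0, 0); purely illustrative.
square <- function(x0) {
  sf::st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

toy_map <- sf::st_sf(
  region = c("A", "B"),
  area = c(10, 200),
  geometry = sf::st_sfc(square(0), square(2), crs = 4326)
)

# Constant fill: every region gets the same colour.
p_fixed <- ggplot2::ggplot(toy_map) +
  ggplot2::geom_sf(fill = "brown", colour = "black")

# Mapped fill: colour varies with the area column and a legend appears.
p_mapped <- ggplot2::ggplot(toy_map) +
  ggplot2::geom_sf(ggplot2::aes(fill = area), colour = "black")
```

Printing p_fixed and p_mapped side by side shows the same shapes, with a legend only on the second.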

OK. Not bad.

Let’s tidy the map by fixing the legend with the scales package and adjusting the colour palette:

gcc_map %>% 
  dplyr::filter(!is.na(areasqkm_2021)) %>% 
  dplyr::filter(state_name_2021 != "Other Territories") %>% 
  ggplot2::ggplot() +
  ggplot2::geom_sf(aes(geometry = geometry, 
                       fill = areasqkm_2021), 
                   colour = "black") +
  ggsflabel::geom_sf_label_repel(aes(label = gcc_name_2021),
                                 size = 3) +
  # midpoint defaults to 0, so with all-positive areas only the
  # orange-to-steelblue half of the gradient is used
  ggplot2::scale_fill_gradient2(low = "red", 
                                mid = "orange", 
                                high = "steelblue", 
                                na.value = NA,
                                name = "Area (km²)",
                                labels = scales::comma) + 
  ggplot2::theme_void() 

This is looking pretty good and makes it easy to scan the map and see that “Rest of WA” is the largest region by area.
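
One detail worth knowing: scale_fill_gradient2() is a diverging scale whose midpoint argument defaults to 0. Because areas are always positive, the "low" colour is never reached unless the midpoint is moved. The sketch below, on made-up squares with areas rounded from the output earlier, sets the midpoint to the median area. That is an illustrative choice, not what the map above does. It also shows how scales::comma formats the legend labels.

```r
library(sf)
library(ggplot2)

# Unit square with its lower-left corner at (x0, 0); purely illustrative.
square <- function(x0) {
  sf::st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

toy_map <- sf::st_sf(
  region = c("A", "B", "C"),
  area = c(9993, 12369, 788429),
  geometry = sf::st_sfc(square(0), square(2), square(4), crs = 4326)
)

p <- ggplot2::ggplot(toy_map) +
  ggplot2::geom_sf(ggplot2::aes(fill = area), colour = "black") +
  ggplot2::scale_fill_gradient2(
    low = "red", mid = "orange", high = "steelblue",
    midpoint = stats::median(toy_map$area),  # assumed choice of midpoint
    name = "Area (km²)",
    labels = scales::comma
  ) +
  ggplot2::theme_void()

# scales::comma adds thousands separators to the legend labels.
label_example <- scales::comma(788429)
label_example
```

If a diverging scale is not needed, a two-colour ggplot2::scale_fill_gradient() avoids the midpoint question altogether.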

Mapping External Data

A major benefit of maps is that you can visualise variables across regions, such as population or income.

In R, this involves joining two data sets together based on a common field.

Let’s create the greater capital city area map again, this time colouring the regions by population (as shown in Figure 1).

First, we need to download population data from the ABS website.

Then we import the population data as a dataframe and tidy it up into a usable format:

# A tibble: 6 × 2
  gcc_name          population_2022_gcc
  <chr>                           <dbl>
1 Rest of NSW                   2862995
2 Greater Sydney                5302736
3 Rest of Vic.                  1590226
4 Greater Melbourne             5035738
5 Greater Brisbane              2625341
6 Rest of Qld                   2695155
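
The import and tidying steps depend on the layout of the downloaded file. As a stand-in, the rows shown above can be typed in directly. The sketch below recreates only those six rows, under the name data_population_aus used in the next step; it is not the full dataset.

```r
library(tibble)

# The six rows shown above, entered by hand (not the full ABS table).
data_population_aus <- tibble::tribble(
  ~gcc_name,           ~population_2022_gcc,
  "Rest of NSW",       2862995,
  "Greater Sydney",    5302736,
  "Rest of Vic.",      1590226,
  "Greater Melbourne", 5035738,
  "Greater Brisbane",  2625341,
  "Rest of Qld",       2695155
)

data_population_aus
```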

Now we merge the population data with the greater capital city area map data from earlier:

# Remove "_2021" from all column names of the gcc map object,
# then merge the datasets on the common variable "gcc_name"
gcc_map_merge <- 
  gcc_map %>% 
  dplyr::rename_all(~ stringr::str_replace(., "_2021", "")) %>% 
  dplyr::left_join(
    data_population_aus %>% 
      dplyr::select(gcc_name, population_2022_gcc) %>% 
      dplyr::distinct(),
    by = "gcc_name"
  ) %>% 
  dplyr::relocate(population_2022_gcc, .after = gcc_name)

The result is a map object with a new variable storing the greater capital city area population:

# A tibble: 5 × 9
  gcc_code gcc_name  population_2022_gcc state_code state_name areasqkm cent_lat
  <chr>    <chr>                   <dbl> <chr>      <chr>         <dbl>    <dbl>
1 1GSYD    Greater …             5302736 1          New South…   12369.    -33.7
2 1RNSW    Rest of …             2862995 1          New South…  788429.    -32.1
3 19499    No usual…                  NA 1          New South…      NA      NA  
4 19799    Migrator…                  NA 1          New South…      NA      NA  
5 2GMEL    Greater …             5035738 2          Victoria      9993.    -37.8
# ℹ 2 more variables: cent_long <dbl>, geometry <MULTIPOLYGON [°]>
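
A left join keeps every row of the map and fills the population with NA wherever no name matches, which is why some rows above show NA. A quick way to list the rows that found no match is dplyr::anti_join(). The sketch uses two small made-up tables (plain tibbles without geometry; the row "Unmatched region" is invented):

```r
library(dplyr)

map_tbl <- tibble::tibble(
  gcc_name = c("Greater Sydney", "Greater Melbourne", "Unmatched region")
)
pop_tbl <- tibble::tibble(
  gcc_name = c("Greater Sydney", "Greater Melbourne"),
  population_2022_gcc = c(5302736, 5035738)
)

# Every map row is kept; the unmatched one gets NA for population.
merged <- dplyr::left_join(map_tbl, pop_tbl, by = "gcc_name")

# Rows of the map table with no partner in the population table.
unmatched <- dplyr::anti_join(map_tbl, pop_tbl, by = "gcc_name")
unmatched
```

Running a check like this after a join is a cheap way to catch spelling differences between the two sources, such as "Rest of Vic." versus "Rest of Victoria".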

Let’s plot the map:
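
The call mirrors the area map, with fill mapped to the population column instead. Here is a self-contained sketch of that idea on three made-up squares, using population figures from the table above. The geometry, the toy_map name and the colour choices are assumptions, and ggplot2::geom_sf_label() stands in for ggsflabel::geom_sf_label_repel(). With the real data you would start from gcc_map_merge and apply the same filters as before.

```r
library(sf)
library(ggplot2)

# Unit square with its lower-left corner at (x0, 0); purely illustrative.
square <- function(x0) {
  sf::st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

toy_map <- sf::st_sf(
  gcc_name = c("Greater Sydney", "Greater Melbourne", "Rest of NSW"),
  population_2022_gcc = c(5302736, 5035738, 2862995),
  geometry = sf::st_sfc(square(0), square(2), square(4), crs = 4326)
)

p <- ggplot2::ggplot(toy_map) +
  ggplot2::geom_sf(ggplot2::aes(fill = population_2022_gcc), colour = "black") +
  ggplot2::geom_sf_label(ggplot2::aes(label = gcc_name), size = 3) +
  ggplot2::scale_fill_gradient(
    low = "orange", high = "steelblue",  # assumed palette
    name = "Population (2022)",
    labels = scales::comma
  ) +
  ggplot2::theme_void()
```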

Looks great!

A keen viewer will now be able to quickly spot that Greater Melbourne and Greater Sydney are the most populous regions in Australia, even though they are among the smallest in area.

Conclusion

Creating maps in R is a powerful way to visualise spatial data. With the ggplot2, absmapsdata and strayr packages, it is easy to access, plot and customise map data of Australia.