When publishing studies involving human participants, a demographics table is a vital tool for summarizing the characteristics of the population under investigation.
However, creating a publication-ready demographics table can be a challenging and time-consuming task. If you’ve ever struggled with this, fear not! The following article will guide you through the process of creating a publication-ready demographics table in R, using the gtsummary and flextable packages. These tools will save you time and effort, allowing you to focus on the insights from your data rather than the intricacies of table formatting.
Load required packages:
tidyverse for basic data manipulation (Wickham et al. 2019)
gtsummary for creating demographics tables (Sjoberg et al. 2021)
flextable for creating table objects (Gohel and Skintzos 2023)
labelled for adding labels to data (Larmarange 2022)
officer for exporting flextable table objects to Microsoft Word (Gohel 2023)
if (!require(pacman)) install.packages("pacman")
library(pacman)
p_load(
  tidyverse, flextable, gtsummary, labelled, officer
)

Figure 2 shows an example demographics table published by Ching et al. (2022) in the Ear and Hearing Journal.
Let’s create our own demographics table using a similar, but custom data set.
First we make a custom data frame which contains thirty observations of six demographic variables:
data <- data.frame(
  sex = c(
    "Male", "Female", "Female",
    "Female", "Female", "Female", "Female", "Male", "Female",
    "Female", "Male", "Female", "Female", "Female", "Male",
    "Female", "Female", "Male", "Female", "Male", "Male",
    "Female", "Male", "Female", "Male", "Male", "Male", "Male",
    "Female", "Male"
  ),
  mode_hearing = c(
    "Bimodal", "Bilateral",
    "Bilateral", "Bimodal", "Bimodal", "Bilateral", "Bimodal",
    "Bilateral", "Unilateral", "Bimodal", "Unilateral",
    "Bilateral", "Unilateral", "Bilateral", "Unilateral", "Bilateral",
    "Bilateral", "Bilateral", "Bimodal", "Bimodal",
    "Bilateral", "Bimodal", "Bilateral", "Unilateral", "Bimodal",
    "Bilateral", "Bilateral", "Bimodal", "Bilateral",
    "Unilateral"
  ),
  age = c(
    71L, 52L, 47L, 44L, 68L, 57L,
    77L, 58L, 76L, 79L, 62L, 57L, 76L, 68L, 65L, 43L, 78L, 73L,
    58L, 45L, 76L, 44L, 66L, 42L, 76L, 63L, 76L, 54L, 50L,
    56L
  ),
  hearing_loss = c(
    98L, 108L, 66L, 99L, 115L, 95L,
    98L, 110L, 94L, 110L, 75L, 64L, 81L, 107L, 109L, 118L,
    75L, 79L, 119L, 67L, 73L, 73L, 86L, 97L, 72L, 82L, 91L,
    94L, 94L, 80L
  ),
  age_first_ci = c(
    63L, 42L, 36L, 41L, 57L, 51L,
    68L, 44L, 59L, 65L, 48L, 45L, 58L, 59L, 59L, 24L, 59L, 60L,
    52L, 36L, 73L, 24L, 48L, 37L, 64L, 46L, 62L, 38L, 40L,
    43L
  ),
  duration_use = c(
    8L, 10L, 11L, 3L, 11L, 6L, 9L,
    14L, 17L, 14L, 14L, 12L, 18L, 9L, 6L, 19L, 19L, 13L, 6L,
    9L, 3L, 20L, 18L, 5L, 12L, 17L, 14L, 16L, 10L, 13L
  )
)
flextable::flextable(head(data))

| sex | mode_hearing | age | hearing_loss | age_first_ci | duration_use |
|---|---|---|---|---|---|
| Male | Bimodal | 71 | 98 | 63 | 8 |
| Female | Bilateral | 52 | 108 | 42 | 10 |
| Female | Bilateral | 47 | 66 | 36 | 11 |
| Female | Bimodal | 44 | 99 | 41 | 3 |
| Female | Bimodal | 68 | 115 | 57 | 11 |
| Female | Bilateral | 57 | 95 | 51 | 6 |
Then we add nicely formatted labels to all variables in the data frame using the labelled package [1]:
labelled::var_label(data) <-
  list(
    sex = "Sex",
    mode_hearing = "Mode of hearing",
    age = "Age at assessment (Years)",
    hearing_loss = "Hearing Loss (4FA)",
    age_first_ci = "Age at first CI (Years)",
    duration_use = "Duration of use (Years)"
  )
flextable::flextable(var_label(data) %>%
  as.data.frame())

| sex | mode_hearing | age | hearing_loss | age_first_ci | duration_use |
|---|---|---|---|---|---|
| Sex | Mode of hearing | Age at assessment (Years) | Hearing Loss (4FA) | Age at first CI (Years) | Duration of use (Years) |
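To see what the labels are doing, here is a minimal sketch on a hypothetical three-row data frame (`toy` and `age_label` are illustrative names, not part of the article's data). It assumes the labelled package is installed. The label is stored alongside each column, while the column names and values stay as they were.

```r
library(labelled)

# Hypothetical toy data frame, used only for illustration
toy <- data.frame(
  age = c(71L, 52L, 47L),
  sex = c("Male", "Female", "Female")
)

# Attach a label to each column; the column names stay the same
var_label(toy) <- list(
  age = "Age at assessment (Years)",
  sex = "Sex"
)

# Read a single label back
age_label <- var_label(toy$age)
print(age_label)

# The column names are unchanged by labelling
print(names(toy))
```

Because the labels travel with the data frame, they only need to be set once, before any table is built.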
We then create the demographics table with the tbl_summary() function from the gtsummary package:
# set gtsummary themes
gtsummary::reset_gtsummary_theme()
# theme_gtsummary_journal(journal = "nejm", set_theme = TRUE)

# create demographics table
basic <- data %>%
  tbl_summary(
    type = all_continuous() ~ "continuous2",
    statistic = list(
      all_continuous() ~ c(
        "{mean} ({sd})",
        "{median}",
        "{min}, {max}"
      ),
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = list(all_continuous() ~ 1),
    missing = "no"
  ) %>%
  bold_labels() %>%
  modify_header(
    label = "",
    all_stat_cols() ~ "**{level}**, N = {n} ({style_percent(p)}%)"
  ) %>%
  gtsummary::as_flex_table()
basic %>%
  flextable::autofit() %>%
  flextable::width(width = 3) %>%
  flextable::line_spacing(space = 0.35, part = "body")

|  | Overall, N = 30 (100%)¹ |
|---|---|
| Sex |  |
| Female | 17 (57%) |
| Male | 13 (43%) |
| Mode of hearing |  |
| Bilateral | 14 (47%) |
| Bimodal | 10 (33%) |
| Unilateral | 6 (20%) |
| Age at assessment (Years) |  |
| Mean (SD) | 61.9 (12.3) |
| Median | 62.5 |
| Range | 42.0, 79.0 |
| Hearing Loss (4FA) |  |
| Mean (SD) | 91.0 (16.4) |
| Median | 94.0 |
| Range | 64.0, 119.0 |
| Age at first CI (Years) |  |
| Mean (SD) | 50.0 (12.5) |
| Median | 49.5 |
| Range | 24.0, 73.0 |
| Duration of use (Years) |  |
| Mean (SD) | 11.9 (4.9) |
| Median | 12.0 |
| Range | 3.0, 20.0 |

¹ n (%)
And there we have a demographics table that is ready to publish!
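The strings passed to the statistic argument are templates in which each name in curly braces is replaced by the corresponding summary statistic. As a sanity check, the three rows for one continuous variable can be rebuilt by hand in base R. This is only a sketch: `age_check` is an illustrative name, and base R rounding may differ from gtsummary in edge cases, so treat any small discrepancy as a rounding question rather than an error.

```r
# Age values copied from the data frame defined earlier
age <- c(
  71L, 52L, 47L, 44L, 68L, 57L, 77L, 58L, 76L, 79L,
  62L, 57L, 76L, 68L, 65L, 43L, 78L, 73L, 58L, 45L,
  76L, 44L, 66L, 42L, 76L, 63L, 76L, 54L, 50L, 56L
)

# Rebuild the three rows requested for continuous variables,
# formatted to one decimal place to mirror digits = 1
age_check <- c(
  "Mean (SD)" = sprintf("%.1f (%.1f)", mean(age), sd(age)),
  "Median"    = sprintf("%.1f", median(age)),
  "Range"     = sprintf("%.1f, %.1f", min(age), max(age))
)
print(age_check)
```

The printed values should agree with the "Age at assessment (Years)" rows of the table.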
Sometimes we need to display demographic variables across study groups, for example to compare participant characteristics between different modes of hearing.
With a few adjustments to the tbl_summary() function, we can make a new demographics table that allows us to quickly observe the differences in participant characteristics across the various hearing modes:
# create demographics table with grouping variable
gtsummary::theme_gtsummary_compact(set_theme = TRUE, font_size = 10)
adv <- data %>%
  tbl_summary(
    by = mode_hearing,
    type = all_continuous() ~ "continuous2",
    statistic = list(
      all_continuous() ~ c(
        "{mean} ({sd})",
        "{median}",
        "{min}, {max}"
      ),
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = list(all_continuous() ~ 1),
    missing = "no"
  ) %>%
  add_overall() %>%
  add_p() %>%
  bold_labels() %>%
  modify_header(
    label = "",
    all_stat_cols() ~ "**{level}**, N = {n} ({style_percent(p)}%)"
  ) %>%
  gtsummary::as_flex_table()
adv %>%
  flextable::line_spacing(space = 1.1, part = "body") %>%
  flextable::autofit()

|  | Overall, N = 30 (100%)¹ | Bilateral, N = 14 (47%)¹ | Bimodal, N = 10 (33%)¹ | Unilateral, N = 6 (20%)¹ | p-value² |
|---|---|---|---|---|---|
| Sex |  |  |  |  | >0.9 |
| Female | 17 (57%) | 8 (57%) | 6 (60%) | 3 (50%) |  |
| Male | 13 (43%) | 6 (43%) | 4 (40%) | 3 (50%) |  |
| Age at assessment (Years) |  |  |  |  | >0.9 |
| Mean (SD) | 61.9 (12.3) | 61.7 (11.5) | 61.6 (14.3) | 62.8 (12.9) |  |
| Median | 62.5 | 60.5 | 63.0 | 63.5 |  |
| Range | 42.0, 79.0 | 43.0, 78.0 | 44.0, 79.0 | 42.0, 76.0 |  |
| Hearing Loss (4FA) |  |  |  |  | 0.6 |
| Mean (SD) | 91.0 (16.4) | 89.1 (17.1) | 94.5 (18.3) | 89.3 (12.9) |  |
| Median | 94.0 | 88.5 | 98.0 | 87.5 |  |
| Range | 64.0, 119.0 | 64.0, 118.0 | 67.0, 119.0 | 75.0, 109.0 |  |
| Age at first CI (Years) |  |  |  |  | >0.9 |
| Mean (SD) | 50.0 (12.5) | 49.2 (12.5) | 50.8 (15.1) | 50.7 (9.4) |  |
| Median | 49.5 | 47.0 | 54.5 | 53.0 |  |
| Range | 24.0, 73.0 | 24.0, 73.0 | 24.0, 68.0 | 37.0, 59.0 |  |
| Duration of use (Years) |  |  |  |  | 0.6 |
| Mean (SD) | 11.9 (4.9) | 12.5 (4.8) | 10.8 (5.0) | 12.2 (5.5) |  |
| Median | 12.0 | 12.5 | 10.0 | 13.5 |  |
| Range | 3.0, 20.0 | 3.0, 19.0 | 3.0, 20.0 | 5.0, 18.0 |  |

¹ n (%)
² Fisher's exact test; Kruskal-Wallis rank sum test
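The table footnote names the two tests behind the p-value column: Fisher's exact test for the categorical variable and the Kruskal-Wallis rank sum test for the continuous ones. As a rough cross-check, the same tests can be run directly with base R's stats functions. This is a sketch of the underlying idea, not necessarily the exact calls gtsummary makes internally; `tab`, `fisher_res` and `kw_res` are illustrative names.

```r
# Variables copied from the data frame defined earlier
sex <- c(
  "Male", "Female", "Female", "Female", "Female", "Female",
  "Female", "Male", "Female", "Female", "Male", "Female",
  "Female", "Female", "Male", "Female", "Female", "Male",
  "Female", "Male", "Male", "Female", "Male", "Female",
  "Male", "Male", "Male", "Male", "Female", "Male"
)
mode_hearing <- c(
  "Bimodal", "Bilateral", "Bilateral", "Bimodal", "Bimodal",
  "Bilateral", "Bimodal", "Bilateral", "Unilateral", "Bimodal",
  "Unilateral", "Bilateral", "Unilateral", "Bilateral", "Unilateral",
  "Bilateral", "Bilateral", "Bilateral", "Bimodal", "Bimodal",
  "Bilateral", "Bimodal", "Bilateral", "Unilateral", "Bimodal",
  "Bilateral", "Bilateral", "Bimodal", "Bilateral", "Unilateral"
)
age <- c(
  71L, 52L, 47L, 44L, 68L, 57L, 77L, 58L, 76L, 79L,
  62L, 57L, 76L, 68L, 65L, 43L, 78L, 73L, 58L, 45L,
  76L, 44L, 66L, 42L, 76L, 63L, 76L, 54L, 50L, 56L
)

# Categorical variable: cross-tabulate, then Fisher's exact test
tab <- table(sex, mode_hearing)
print(tab)
fisher_res <- fisher.test(tab)
print(fisher_res$p.value)

# Continuous variable: Kruskal-Wallis test across the three groups
kw_res <- kruskal.test(x = age, g = factor(mode_hearing))
print(kw_res$p.value)
```

The cross-tabulation should reproduce the counts in the Sex rows, and both p-values should fall in the range reported in the table.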
Typically, a demographics table will need to be inserted into a Microsoft Word document where it can join the rest of a manuscript or report.
The gtsummary package, in combination with the flextable and officer packages, makes exporting demographics tables to Microsoft Word very easy.
Let’s export the grouped demographics table to Microsoft Word using the flextable and officer packages:
flextable::save_as_docx(
  adv,
  path = "articles/demographics_table_r/demo_table_adv.docx",
  pr_section = officer::prop_section(
    page_size = officer::page_size(
      orient = "landscape",
      width = 8.3, height = 11.7
    ),
    type = "continuous",
    page_margins = officer::page_mar()
  )
)

Once exported, the table can be edited just like any other Microsoft Word table.
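To try the export step without a project folder, the following minimal sketch writes a small stand-in table to a temporary file. It assumes the flextable and officer packages are installed; `ft` and `out_path` are illustrative names, and the `mtcars` subset simply stands in for a real demographics table.

```r
library(flextable)

# Small stand-in table; any flextable object is exported the same way
ft <- flextable(head(mtcars[, 1:4]))

# Write to a temporary file so the example does not depend on
# an existing project folder
out_path <- tempfile(fileext = ".docx")
save_as_docx(ft, path = out_path)

# Confirm that the file was written
print(file.exists(out_path))
```

If an export to a project path fails, one thing worth checking is that the target folder already exists.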
When publishing study results, creating comprehensive and well-formatted demographics tables is essential, but it can also be time-consuming.
The gtsummary package in R drastically simplifies the process of generating publication-ready demographics tables.
With the help of the flextable and officer packages, gtsummary tables can be easily exported to Microsoft Word.
[1] When creating the demographics table, these formatted labels will be shown in the table instead of the actual variable names. This is an important step which will save you a lot of effort renaming individual variables within the table.