When publishing studies involving human participants, a demographics table is a vital tool for summarizing the characteristics of the population under investigation.
However, creating a publication-ready demographics table can be a challenging and time-consuming task. If you’ve ever struggled with this, fear not! This article will guide you through the process of creating a publication-ready demographics table in R, using the gtsummary and flextable packages. These tools will save you time and effort, allowing you to focus on the insights from your data rather than the intricacies of table formatting.
Load required packages:
- tidyverse for basic data manipulation (Wickham et al. 2019)
- gtsummary for creating demographics tables (Sjoberg et al. 2021)
- flextable for creating table objects (Gohel and Skintzos 2023)
- labelled for adding labels to data (Larmarange 2022)
- officer for exporting flextable table objects to Microsoft Word (Gohel 2023)
if (!require(pacman)) install.packages("pacman")
library(pacman)
p_load(
  tidyverse, flextable, gtsummary, labelled, officer
)
Figure 2 shows an example demographics table published by Ching et al. (2022) in the journal Ear and Hearing.
Let’s create our own demographics table using a similar, but custom data set.
First we make a custom data frame which contains thirty observations of six demographic variables:
data <- data.frame(
  sex = c(
    "Male", "Female", "Female",
    "Female", "Female", "Female", "Female", "Male", "Female",
    "Female", "Male", "Female", "Female", "Female", "Male",
    "Female", "Female", "Male", "Female", "Male", "Male",
    "Female", "Male", "Female", "Male", "Male", "Male", "Male",
    "Female", "Male"
  ),
  mode_hearing = c(
    "Bimodal", "Bilateral",
    "Bilateral", "Bimodal", "Bimodal", "Bilateral", "Bimodal",
    "Bilateral", "Unilateral", "Bimodal", "Unilateral",
    "Bilateral", "Unilateral", "Bilateral", "Unilateral", "Bilateral",
    "Bilateral", "Bilateral", "Bimodal", "Bimodal",
    "Bilateral", "Bimodal", "Bilateral", "Unilateral", "Bimodal",
    "Bilateral", "Bilateral", "Bimodal", "Bilateral",
    "Unilateral"
  ),
  age = c(
    71L, 52L, 47L, 44L, 68L, 57L,
    77L, 58L, 76L, 79L, 62L, 57L, 76L, 68L, 65L, 43L, 78L, 73L,
    58L, 45L, 76L, 44L, 66L, 42L, 76L, 63L, 76L, 54L, 50L,
    56L
  ),
  hearing_loss = c(
    98L, 108L, 66L, 99L, 115L, 95L,
    98L, 110L, 94L, 110L, 75L, 64L, 81L, 107L, 109L, 118L,
    75L, 79L, 119L, 67L, 73L, 73L, 86L, 97L, 72L, 82L, 91L,
    94L, 94L, 80L
  ),
  age_first_ci = c(
    63L, 42L, 36L, 41L, 57L, 51L,
    68L, 44L, 59L, 65L, 48L, 45L, 58L, 59L, 59L, 24L, 59L, 60L,
    52L, 36L, 73L, 24L, 48L, 37L, 64L, 46L, 62L, 38L, 40L,
    43L
  ),
  duration_use = c(
    8L, 10L, 11L, 3L, 11L, 6L, 9L,
    14L, 17L, 14L, 14L, 12L, 18L, 9L, 6L, 19L, 19L, 13L, 6L,
    9L, 3L, 20L, 18L, 5L, 12L, 17L, 14L, 16L, 10L, 13L
  )
)
flextable::flextable(head(data))
sex | mode_hearing | age | hearing_loss | age_first_ci | duration_use |
---|---|---|---|---|---|
Male | Bimodal | 71 | 98 | 63 | 8 |
Female | Bilateral | 52 | 108 | 42 | 10 |
Female | Bilateral | 47 | 66 | 36 | 11 |
Female | Bimodal | 44 | 99 | 41 | 3 |
Female | Bimodal | 68 | 115 | 57 | 11 |
Female | Bilateral | 57 | 95 | 51 | 6 |
Then we add nicely formatted labels to all variables in the data frame using the labelled package¹:
labelled::var_label(data) <-
  list(
    sex = "Sex",
    mode_hearing = "Mode of hearing",
    age = "Age at assessment (Years)",
    hearing_loss = "Hearing Loss (4FA)",
    age_first_ci = "Age at first CI (Years)",
    duration_use = "Duration of use (Years)"
  )

flextable::flextable(var_label(data) %>%
  as.data.frame())
sex | mode_hearing | age | hearing_loss | age_first_ci | duration_use |
---|---|---|---|---|---|
Sex | Mode of hearing | Age at assessment (Years) | Hearing Loss (4FA) | Age at first CI (Years) | Duration of use (Years) |
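To make the labelling step more concrete, here is a minimal sketch on a hypothetical two-row data frame called `toy` (not the article's data). It assumes the labelled package is installed, and it illustrates that a label is stored as an attribute on each column, so the column names used in your code stay the same.

```r
library(labelled)

# Hypothetical toy data frame, used only for this illustration
toy <- data.frame(
  sex = c("Male", "Female"),
  age = c(71L, 52L)
)

# Assign labels to several columns at once with a named list
var_label(toy) <- list(sex = "Sex", age = "Age at assessment (Years)")

# Column names are unchanged; the label lives in a "label" attribute
names(toy)
var_label(toy$age)
attr(toy$sex, "label")
```

Because only the attribute changes, any code that refers to `toy$age` keeps working after the labels are added.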
We then create the demographics table with the tbl_summary() function from the gtsummary package:
# set gtsummary themes
gtsummary::reset_gtsummary_theme()
# theme_gtsummary_journal(journal = "nejm", set_theme = TRUE)

# create demographics table
basic <- data %>%
  tbl_summary(
    type = all_continuous() ~ "continuous2",
    statistic = list(
      all_continuous() ~ c(
        "{mean} ({sd})",
        "{median}",
        "{min}, {max}"
      ),
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = list(all_continuous() ~ 1),
    missing = "no"
  ) %>%
  bold_labels() %>%
  modify_header(
    label = "",
    all_stat_cols() ~ "**{level}**, N = {n} ({style_percent(p)}%)"
  ) %>%
  gtsummary::as_flex_table()

basic %>%
  flextable::autofit() %>%
  flextable::width(width = 3) %>%
  flextable::line_spacing(space = 0.35, part = "body")
| | Overall, N = 30 (100%)¹ |
|---|---|
| Sex | |
| Female | 17 (57%) |
| Male | 13 (43%) |
| Mode of hearing | |
| Bilateral | 14 (47%) |
| Bimodal | 10 (33%) |
| Unilateral | 6 (20%) |
| Age at assessment (Years) | |
| Mean (SD) | 61.9 (12.3) |
| Median | 62.5 |
| Range | 42.0, 79.0 |
| Hearing Loss (4FA) | |
| Mean (SD) | 91.0 (16.4) |
| Median | 94.0 |
| Range | 64.0, 119.0 |
| Age at first CI (Years) | |
| Mean (SD) | 50.0 (12.5) |
| Median | 49.5 |
| Range | 24.0, 73.0 |
| Duration of use (Years) | |
| Mean (SD) | 11.9 (4.9) |
| Median | 12.0 |
| Range | 3.0, 20.0 |

¹ n (%)
And there we have a demographics table that is ready to publish!
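As a quick sanity check, the continuous rows can be reproduced by hand. The sketch below uses only base R and recomputes the statistics for Age at assessment from the values in the data frame above; the object names `age` and `age_summary` are introduced here purely for illustration.

```r
# Age at assessment, copied from the data frame defined earlier
age <- c(
  71L, 52L, 47L, 44L, 68L, 57L,
  77L, 58L, 76L, 79L, 62L, 57L, 76L, 68L, 65L, 43L, 78L, 73L,
  58L, 45L, 76L, 44L, 66L, 42L, 76L, 63L, 76L, 54L, 50L,
  56L
)

# Same statistics as the "{mean} ({sd})", "{median}" and "{min}, {max}" rows
age_summary <- c(
  mean = mean(age),
  sd = sd(age),
  median = median(age),
  min = min(age),
  max = max(age)
)

# Round to one decimal place, mirroring digits = 1 in tbl_summary()
round(age_summary, 1)
```

The rounded values (61.9, 12.3, 62.5, 42 and 79) agree with the Age at assessment rows in the table.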
Sometimes we need to display demographic variables across a specific study group, such as comparing participant characteristics based on different modes of hearing.
With a few adjustments to the tbl_summary() function, we can make a new demographics table that allows us to quickly observe the differences in participant characteristics across the various hearing modes:
# create demographics table with grouping variable
gtsummary::theme_gtsummary_compact(set_theme = TRUE, font_size = 10)

adv <- data %>%
  tbl_summary(
    by = mode_hearing,
    type = all_continuous() ~ "continuous2",
    statistic = list(
      all_continuous() ~ c(
        "{mean} ({sd})",
        "{median}",
        "{min}, {max}"
      ),
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = list(all_continuous() ~ 1),
    missing = "no"
  ) %>%
  add_overall() %>%
  add_p() %>%
  bold_labels() %>%
  modify_header(
    label = "",
    all_stat_cols() ~ "**{level}**, N = {n} ({style_percent(p)}%)"
  ) %>%
  gtsummary::as_flex_table()

adv %>%
  flextable::line_spacing(space = 1.1, part = "body") %>%
  flextable::autofit()
| | Overall, N = 30 (100%)¹ | Bilateral, N = 14 (47%)¹ | Bimodal, N = 10 (33%)¹ | Unilateral, N = 6 (20%)¹ | p-value² |
|---|---|---|---|---|---|
| Sex | | | | | >0.9 |
| Female | 17 (57%) | 8 (57%) | 6 (60%) | 3 (50%) | |
| Male | 13 (43%) | 6 (43%) | 4 (40%) | 3 (50%) | |
| Age at assessment (Years) | | | | | >0.9 |
| Mean (SD) | 61.9 (12.3) | 61.7 (11.5) | 61.6 (14.3) | 62.8 (12.9) | |
| Median | 62.5 | 60.5 | 63.0 | 63.5 | |
| Range | 42.0, 79.0 | 43.0, 78.0 | 44.0, 79.0 | 42.0, 76.0 | |
| Hearing Loss (4FA) | | | | | 0.6 |
| Mean (SD) | 91.0 (16.4) | 89.1 (17.1) | 94.5 (18.3) | 89.3 (12.9) | |
| Median | 94.0 | 88.5 | 98.0 | 87.5 | |
| Range | 64.0, 119.0 | 64.0, 118.0 | 67.0, 119.0 | 75.0, 109.0 | |
| Age at first CI (Years) | | | | | >0.9 |
| Mean (SD) | 50.0 (12.5) | 49.2 (12.5) | 50.8 (15.1) | 50.7 (9.4) | |
| Median | 49.5 | 47.0 | 54.5 | 53.0 | |
| Range | 24.0, 73.0 | 24.0, 73.0 | 24.0, 68.0 | 37.0, 59.0 | |
| Duration of use (Years) | | | | | 0.6 |
| Mean (SD) | 11.9 (4.9) | 12.5 (4.8) | 10.8 (5.0) | 12.2 (5.5) | |
| Median | 12.0 | 12.5 | 10.0 | 13.5 | |
| Range | 3.0, 20.0 | 3.0, 19.0 | 3.0, 20.0 | 5.0, 18.0 | |

¹ n (%)

² Fisher's exact test; Kruskal-Wallis rank sum test
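The table footnote names the two tests behind the p-value column. As a rough illustration of what they do, the base R sketch below runs a Fisher's exact test on the Sex counts shown in the table and a Kruskal-Wallis test on Age at assessment across the three hearing modes. It assumes that `fisher.test()` and `kruskal.test()` with default settings are a fair stand-in for what `add_p()` ran here; the object names are introduced only for this example.

```r
# Sex by mode of hearing, using the counts shown in the grouped table
sex_by_mode <- matrix(
  c(8, 6, 3,   # Female
    6, 4, 3),  # Male
  nrow = 2, byrow = TRUE,
  dimnames = list(
    sex = c("Female", "Male"),
    mode_hearing = c("Bilateral", "Bimodal", "Unilateral")
  )
)
fisher_p <- fisher.test(sex_by_mode)$p.value

# Age at assessment and mode of hearing, copied from the data frame above
age <- c(
  71L, 52L, 47L, 44L, 68L, 57L,
  77L, 58L, 76L, 79L, 62L, 57L, 76L, 68L, 65L, 43L, 78L, 73L,
  58L, 45L, 76L, 44L, 66L, 42L, 76L, 63L, 76L, 54L, 50L,
  56L
)
mode_hearing <- c(
  "Bimodal", "Bilateral",
  "Bilateral", "Bimodal", "Bimodal", "Bilateral", "Bimodal",
  "Bilateral", "Unilateral", "Bimodal", "Unilateral",
  "Bilateral", "Unilateral", "Bilateral", "Unilateral", "Bilateral",
  "Bilateral", "Bilateral", "Bimodal", "Bimodal",
  "Bilateral", "Bimodal", "Bilateral", "Unilateral", "Bimodal",
  "Bilateral", "Bilateral", "Bimodal", "Bilateral",
  "Unilateral"
)

# Rank-based comparison of age across the three hearing modes
kw_p <- kruskal.test(x = age, g = factor(mode_hearing))$p.value

c(fisher = fisher_p, kruskal_wallis = kw_p)
```

Both p-values should be in line with the ">0.9" entries reported for Sex and Age at assessment, which suggests no detectable difference between hearing modes on these two variables in this small sample.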
Typically, a demographics table will need to be inserted into a Microsoft Word document where it can join the rest of a manuscript or report.
The gtsummary package, in combination with the flextable and officer packages, makes exporting demographics tables to Microsoft Word very easy.
Let’s export our grouped demographics table to Microsoft Word using the flextable and officer packages:
flextable::save_as_docx(adv,
  path = "articles/demographics_table_r/demo_table_adv.docx",
  pr_section =
    officer::prop_section(
      page_size = officer::page_size(
        orient = "landscape",
        width = 8.3, height = 11.7
      ),
      type = "continuous",
      page_margins = officer::page_mar()
    )
)
Once exported, the table can be edited just like any other Microsoft Word table.
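If the table needs to sit alongside other content in the same document, one possible alternative is to build the Word file with officer directly. The sketch below is only an illustration: it uses a small hypothetical stand-in table `ft` (in practice this would be the flextable object created above), a made-up caption, and a temporary file path rather than a real manuscript location.

```r
library(flextable)
library(officer)

# Hypothetical stand-in table; replace with your own flextable object
ft <- flextable(data.frame(
  sex = c("Male", "Female"),
  age = c(71L, 52L)
))

# Temporary output path, used only for this sketch
out_path <- tempfile(fileext = ".docx")

# Start an empty Word document, add a caption paragraph, then the table
doc <- read_docx()
doc <- body_add_par(doc, "Table 1. Participant demographics")
doc <- body_add_flextable(doc, value = ft)

# Write the document to disk
print(doc, target = out_path)

file.exists(out_path)
```

This route takes a few more lines than `save_as_docx()`, but it lets you add headings, paragraphs, or several tables to one file.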
When publishing study results, creating comprehensive and well-formatted demographics tables is essential, but it can also be time-consuming.
The gtsummary package in R drastically simplifies the process of generating publication-ready demographics tables.
With the help of the flextable and officer packages, gtsummary tables can be easily exported to Microsoft Word.
¹ When creating the demographics table, these formatted labels will be shown in the table instead of the actual variable names. This is an important step which will save you a lot of effort renaming individual variables within the table.