Code
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse, lme4, sjPlot, lmerTest, sjtable2df, flextable, ggpubr)Linear mixed models are a powerful statistical technique that extend simple linear regression. The key difference is that they include random effects. Random effects account for the nested structure in data, where observations are grouped into subcategories that can influence the outcome variable.
In this article, we will delve into a hypothetical data set to investigate the relationship between smiles and beers using linear mixed models in R.
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse, lme4, sjPlot, lmerTest, sjtable2df, flextable, ggpubr)Imagine we have sampled 15 bars in a city and collected 234 observations of two key variables1:
The number of beers consumed by customers in each bar during a given time period.
The number of smiles observed from customers in each bar during the same time period.
The data might look something like this:
data %>%
head(5) %>%
bind_rows(data %>% tail(1)) %>%
flextable::flextable() %>%
flextable::autofit()obs | bar | smile | beer |
|---|---|---|---|
1 | a | 9 | 1 |
2 | a | 11 | 2 |
3 | a | 12 | 2 |
4 | b | 13 | 2 |
5 | b | 11 | 2 |
234 | q | 6 | 6 |
Our analysis aims to explore the relationship between the number of beers consumed and the number of smiles produced by customers.
If we disregard the fact that we sampled customers within different bars, we can model the relationship between the number of beers consumed and the number of smiles produced using simple linear regression:
data %>%
ggpubr::ggscatter(
x = "beer",
y = "smile",
color = "black", shape = 21, size = 3, # Points color, shape and size
add = "reg.line", # Add reg. line
add.params = list(color = "blue", fill = "lightgray"), # Customize reg. line
conf.int = TRUE, # Add confidence interval
) +
coord_cartesian(ylim = c(0, 15))sjtable2df::mtab2df(
sjPlot::tab_model(s_lm,
show.intercept = T,
p.style = "numeric",
digits.rsq = 1
), 1
) %>%
flextable::flextable() %>%
flextable::autofit()Predictors | Estimates | CI | p |
|---|---|---|---|
(Intercept) | 9.30 | 8.66 – 9.94 | <0.001 |
beer | -0.44 | -0.61 – -0.27 | <0.001 |
Observations | 234 | ||
R2 / R2 adjusted | 0.1 / 0.1 |
The analysis suggests a negative relationship between beer consumption and smiles, with each beer resulting in an average of 0.44 less smiles.
However, if we examine a scatter plot which highlights the clustering of observations within each bar:
data %>%
ggpubr::ggscatter(
x = "beer",
y = "smile",
fill = "bar",
color = "black", shape = 21, size = 3,
add = "reg.line",
add.params = list(color = "bar", fill = "bar"),
conf.int = F,
) +
coord_cartesian(ylim = c(0, 15)) +
theme(legend.position = "right")We can see having separate regression lines for each bar would provide a better fit to the data than using a single, fixed regression line across all bars. The simple linear regression model was biased because it failed to account for the fact that customers within the same bar tended to exhibit more similar smile patterns compared to customers across different bars.
A linear mixed model is just the tool to overcome this bias!
When running linear mixed models, we must carefully consider our data structure and specify the appropriate random effects (random intercept, random slope).
Random effects refer to the sources of random variability or noise in the data. For example, some bars may have a generally higher or lower baseline level of smiles among customers, even after accounting for the number of beers consumed. This bar-specific baseline is captured by the random intercept.
Additionally, the strength of the relationship between the number of beers consumed and the number of smiles observed may vary across different bars, which is accounted for by the random slope.
For our beer and smiles example, let’s explore linear mixed models with different combinations of random intercepts and random slopes to determine the most appropriate random effects structure.
A random intercept model allows the baseline levels of the outcome variable to vary across groups, while assuming a fixed slope. For example, this model would allow each bar to have a different average baseline number of smiles, while assuming the effect of beers is constant across bars.
The first step is to specify our model2 using the lme4 package:
lmm_ri <-
lmerTest::lmer(smile ~ 1 + beer + (1 | bar), data = data)Then we can plot the model with the sjPlot package:
sjPlot::plot_model(lmm_ri,
type = "pred",
terms = c("beer", "bar"),
pred.type = "re",
ci.lvl = NA,
show.data = T,
jitter = F,
) +
ggpubr::theme_pubr() +
theme(legend.position = "right") +
ggplot2::coord_cartesian(ylim = c(0, 15))Finally, we can look at the results:
sjtable2df::mtab2df(sjPlot::tab_model(lmm_ri,
show.intercept = T,
p.style = "numeric",
digits.rsq = 2
), 1) %>%
bind_rows(data.frame(
Predictors = "AIC",
Estimates = as.character(round(AIC(lmm_ri), 1)), CI = "", p = ""
)) %>%
flextable::flextable() %>%
flextable::autofit()Predictors | Estimates | CI | p |
|---|---|---|---|
(Intercept) | 5.84 | 4.47 – 7.21 | <0.001 |
beer | 0.55 | 0.39 – 0.71 | <0.001 |
Random Effects | |||
σ2 | 1.45 | ||
τ00bar | 5.82 | ||
ICC | 0.80 | ||
N bar | 15 | ||
Observations | 234 | ||
Marginal R2 / Conditional R2 | 0.09 / 0.82 | ||
AIC | 819.7 |
The linear mixed model shows that for each additional beer consumed, there is an average increase of 0.55 smiles. This is in contrast to the negative relationship found using simple linear regression. The linear mixed model is more appropriate because it accounts for the fact that smile counts are clustered within bars. By including a random intercept for each bar, the model separates out the variability at the bar level. This allows for a better estimate of the true relationship between beers and smiles, which is positive.
Key findings from the linear mixed model:
There is substantial variability in the average number of smiles across different bars (t00bar = 5.82). This suggests it was appropriate to allow the intercepts to vary.
The Intraclass Correlation (ICC) tells us that 80% of the total variance in smiles is due to differences between bars. This highlights the importance of accounting for the nested structure of customers within bars.
After accounting for beer consumption and bar differences, the remaining unexplained variance (𝜎2) in smiles is 1.45.
The fixed effects (beer consumption) explain 9% of the variance in smiles (marginal R-squared). The fixed and random effects together explain 82% of the variance (conditional R-squared). This indicates that incorporating the bar-level grouping structure explains a significant portion of the total variance.
The Akaike Information Criterion (AIC) of 819.7 can be used to compare the relative quality of different random effect specifications, with lower values indicating better fit.
This type of model is appropriate when there is reason to expect variation in both the baseline number of smiles and the effect of beers on smiles across bars, and both sources of variation are of interest.
Let’s fit the model using the lme4 package:
lmm_ris <-
lmerTest::lmer(smile ~ 1 + beer + (1 + beer | bar), data = data)Then plot the model:
sjPlot::plot_model(lmm_ris,
type = "pred",
terms = c("beer", "bar"),
pred.type = "re",
ci.lvl = NA,
show.data = T,
jitter = F
) +
ggpubr::theme_pubr() +
theme(legend.position = "right") +
ggplot2::coord_cartesian(ylim = c(0, 15))And print the results:
sjtable2df::mtab2df(sjPlot::tab_model(lmm_ris,
show.intercept = T,
p.style = "numeric",
digits.rsq = 2
), 1) %>%
bind_rows(data.frame(
Predictors = "AIC",
Estimates = as.character(round(AIC(lmm_ris), 1)), CI = "", p = ""
)) %>%
flextable::flextable() %>%
flextable::autofit()Predictors | Estimates | CI | p |
|---|---|---|---|
(Intercept) | 5.67 | 4.07 – 7.28 | <0.001 |
beer | 0.56 | 0.37 – 0.74 | <0.001 |
Random Effects | |||
σ2 | 1.43 | ||
τ00bar | 8.34 | ||
τ11bar.beer | 0.03 | ||
ρ01bar | -0.84 | ||
ICC | 0.80 | ||
N bar | 15 | ||
Observations | 234 | ||
Marginal R2 / Conditional R2 | 0.09 / 0.82 | ||
AIC | 822 |
The results from this model are very similar to the random intercept model, showing that the variability of the slopes (t11bar.beer = 0.03) do not influence the interpretation of the results in a substantial way.
This model allows the effect of a predictor (i.e., the slope) to vary across groups, while assuming a constant overall intercept or baseline across groups. For example, a random slope (with fixed intercept) might be useful if we expect the baseline smile levels to be similar across bars (fixed intercept), but the effect of beers on smiles could vary substantially between bars (random slope).
The first step is to specify the model using the lme4 package:
lmm_fi <-
lmerTest::lmer(smile ~ 1 + beer + (0 + beer | bar), data = data)Then we can plot the model:
sjPlot::plot_model(lmm_fi,
type = "pred",
terms = c("beer", "bar"),
pred.type = "re", ci.lvl = NA,
show.data = T,
jitter = F
) +
ggpubr::theme_pubr() +
theme(legend.position = "right") +
ggplot2::coord_cartesian(ylim = c(0, 15))Finally, we can look at the results:
sjtable2df::mtab2df(sjPlot::tab_model(lmm_fi,
show.intercept = T,
p.style = "numeric",
digits.rsq = 2
), 1) %>%
bind_rows(data.frame(
Predictors = "AIC",
Estimates = as.character(round(AIC(lmm_fi), 1)), CI = "", p = ""
)) %>%
filter(!Predictors %in% c("τ00", "ρ01")) %>%
flextable::flextable() %>%
flextable::autofit()Predictors | Estimates | CI | p |
|---|---|---|---|
(Intercept) | 7.14 | 6.55 – 7.72 | <0.001 |
beer | 0.38 | -0.00 – 0.76 | 0.053 |
Random Effects | |||
σ2 | 1.97 | ||
τ11bar.beer | 0.44 | ||
ICC | 0.76 | ||
N bar | 15 | ||
Observations | 234 | ||
Marginal R2 / Conditional R2 | 0.04 / 0.77 | ||
AIC | 885.9 |
The fixed slope analysis does not reveal a significant relationship between beer consumption and smiles, but the model appears to be a poor fit to the data3.
The performance package provides a convenient way to assess the random effects structure in mixed models by computing and comparing important fit indices. This helps determine if the random effects are necessary and appropriately specified.
Let’s compare our linear mixed models using the performance package:
performance::compare_performance(lmm_fi,
lmm_ri,
lmm_ris,
verbose = F,
metrics = "common"
) %>%
as.data.frame() %>%
mutate(across(where(is.numeric), ~ round(., 2))) %>%
select(-contains(c("wt", "model"))) %>%
flextable::flextable() %>%
flextable::autofit()Name | AIC | BIC | R2_conditional | R2_marginal | ICC | RMSE |
|---|---|---|---|---|---|---|
lmm_fi | 883.68 | 897.50 | 0.77 | 0.04 | 0.76 | 1.36 |
lmm_ri | 817.43 | 831.25 | 0.82 | 0.09 | 0.80 | 1.16 |
lmm_ris | 819.86 | 840.60 | 0.82 | 0.09 | 0.80 | 1.15 |
The random intercept model (lmm_ri) seems to be the most suitable model overall. It has the lowest Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values, indicating better fit while accounting for model complexity. Additionally, it shows the highest conditional R-squared (R2) and intraclass correlation coefficient (ICC), suggesting it explains the most variance in the number of smiles.
However, the random intercept and slope model (lmm_ris) also performs strongly. In fact, it demonstrates the best predictive accuracy, with the lowest root mean squared error (RMSE). Incorporating both a random intercept and random slope does not substantially alter the interpretation of the model results compared to the random intercept only model.
In contrast, the random slope model with a fixed intercept (lmm_fi) appears to be the poorest fitting model of the three.
In summary, while the random intercept model is the preferred choice overall based on information criteria and variance explained, the random intercept and slope model should be considered if maximizing predictive accuracy is the primary goal. The random slope only model is not recommended based on these results.
Once a model has been selected, it is important to check the underlying assumptions.
A very easy way to do this is to use the check_model() function from the performance package:
performance::check_model(lmm_ris)Overall, the core assumptions4 appear to be intact and have not been seriously violated.
Linear mixed models are a valuable tool for analyzing complex data structures. By incorporating random effects, they can provide more accurate and informative results compared to simpler models. However, careful consideration of model specification is essential. R packages like sjPlot and performance can facilitate the visualization and interpretation of linear mixed models.
The beers data set is take from the jamovi data library. Inspiration for using the beers data set came from Marcello Gallucci’s fantastic article.↩︎
smile is the dependent or outcome variable. beer is an independent variable (a predictor) in the model. (1 | bar) is a random-effects term, where 1 represents the intercept, and bar is a grouping variable. This term accounts for the random variation associated with the bar groups.↩︎
The intercept is 7.14, and the effect of beer consumption on smiles (0.38, 95% CI: 0.00-0.76) is not statistically significant. The random effects show a small amount of variability in slopes across bars (t11bar.beer = 0.44). The residual variance (𝜎2 = 1.97) is higher than the random intercept model, indicating more unexplained variance. The AIC (885.9) is higher than the random intercept model, suggesting that the fixed slope model fits the data worse than the random intercept model, even after accounting for the complexity of the models.↩︎
See Chapter 18: Testing the Assumptions of Multilevel Models Michael Palmeri https://ademos.people.uic.edu/Chapter18.html#51_examine_results,_what_do_they_mean↩︎