Understanding Confidence Intervals: A Practical R Tutorial

Prerequisites

Load required packages:

if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse, ggpubr, scales, flextable)

Introduction

Confidence intervals are a statistical tool commonly used to understand the variation around an estimate and to visually represent uncertainty. This tutorial will explore what confidence intervals are, how to interpret them, and how to calculate and report them in R. Because confidence intervals are easy to misinterpret, their meaning is explained step by step.

What are confidence intervals?

A confidence interval is a range of values that provides a measure of the uncertainty surrounding an estimate. It is calculated from a sample and is used to estimate the range within which the true population parameter is likely to fall.

As an example, suppose you are the facilities manager of a university with five thousand students enrolled, and are tasked with installing new drinking water fountains across the campus. Before beginning the installation, it is important to have an estimate of the average height of all students at the university to ensure the water fountains are of the most appropriate dimensions.

The question you aim to answer is: what is the average height of all students at the university?

One way to answer this question would be to measure the height of all five thousand students. However, this would be costly and impractical. A more feasible and cost-effective approach would be to measure the height of a randomly selected sample of students and use statistics to estimate the average height of all students. This is where confidence intervals can help!

How to calculate confidence intervals?

Let’s imagine you have measured the height of 100 randomly selected students from the university and the average height of the sample is 174.63cm with a standard deviation¹ of 7.18cm.
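The code in this tutorial assumes a data frame called data_sam with a height column (in cm) and a group column, but the sample itself is not shown. As a minimal sketch, a stand-in can be simulated as below. The normal distribution, the mean of 175, the standard deviation of 7 and the seed are all assumptions made for illustration, so the resulting statistics will be close to, but not identical to, the values reported in this tutorial.

```r
set.seed(123)  # arbitrary seed so the simulated sample is reproducible

# Hypothetical stand-in for the tutorial's data: 100 student heights in cm.
# The distribution and its parameters are assumed, not taken from real data.
data_sam <- data.frame(
  group  = "student",                        # single group label, used on the x axis
  height = rnorm(100, mean = 175, sd = 7)    # simulated heights
)

# Quick look at the simulated sample
summary(data_sam$height)
```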

Let’s visualise the distribution of student heights using the ggpubr package:

data_sam %>% 
  ggpubr::ggboxplot(y = "height", 
                    x = "group",
                    add = "mean",   # overlay the sample mean
                    add.params = list(color = "steelblue",
                                      size = 1),
                    palette = "jco",
                    xlab = " ",
                    ylab = "Height (cm)") +
  theme(legend.position = "none",
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.text.y = element_text(size = 16),
        axis.title = element_text(size = 16)) +
  geom_jitter(width = 0.19)   # individual student heights

Figure 1: Distribution of student heights (cm). Black points represent individual student heights. The blue point is the average student height.

From here, we can construct a confidence interval using the following steps:

Decide how much confidence we want. This is often 95%, which is fine for our purpose²:

confidence_lvl <- 0.95

Compute the standard error³ of the mean:

tbl_1 <- 
  data_sam %>% 
  summarise(mean = mean(height),
            n = n(),
            sd = sd(height),
            sem = sd(height) / sqrt(n)) %>%   # standard error = sd / sqrt(n)
  mutate(across(everything(), ~ round(.x, 2)))   # round every column to 2 decimals

tbl_1 %>%  
  flextable::flextable() %>% 
  flextable::autofit()

mean	n	sd	sem
174.63	100	7.18	0.72

Find the t-score⁴ that corresponds to the desired confidence level:

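tbl_2 <-
  tbl_1 %>%
  # upper-tail critical value of the t-distribution with n - 1 degrees of freedom
  mutate(t_score = qt(p = (1 - confidence_lvl) / 2, df = n - 1, lower.tail = FALSE)) %>% 
  mutate(across(everything(), ~ round(.x, 2)))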

tbl_2 %>%  
  flextable::flextable() %>% 
  flextable::autofit()

mean	n	sd	sem	t_score
174.63	100	7.18	0.72	1.98
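To see why a t-score is used rather than the familiar z-score of 1.96, the short sketch below compares the two for a few sample sizes. The degrees of freedom chosen are arbitrary illustrations; only df = 99 corresponds to our sample of 100 students.

```r
confidence_lvl <- 0.95
alpha <- 1 - confidence_lvl

# Two-sided critical value from the standard normal distribution
z_score <- qnorm(1 - alpha / 2)

# Two-sided critical values from t-distributions with increasing degrees of freedom
dfs <- c(4, 9, 29, 99, 999)   # illustrative choices; df = n - 1
t_scores <- qt(1 - alpha / 2, df = dfs)

# Side-by-side comparison
round(data.frame(df = dfs, t_score = t_scores, z_score = z_score), 3)
```

The t-score is always larger than the z-score and shrinks towards it as the sample grows, so small samples get wider intervals to reflect the extra uncertainty in the estimated standard deviation.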

Calculate the margin of error⁵ and construct the confidence interval⁶:

confint_table <- 
  tbl_2 %>% 
  mutate(moe = t_score * sem) %>%   # margin of error
  mutate(lowci = mean - moe,        # lower bound of the interval
         hici = mean + moe) %>%     # upper bound of the interval
  mutate(across(everything(), ~ round(.x, 2)))


confint_table %>%  
  flextable::flextable() %>% 
  flextable::autofit()

mean	n	sd	sem	t_score	moe	lowci	hici
174.63	100	7.18	0.72	1.98	1.43	173.2	176.06
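The four steps above can also be wrapped into a single helper. The sketch below uses a hypothetical function, ci_from_summary (not part of any package), that takes the summary statistics reported earlier. Because it starts from the rounded mean and standard deviation and does not round intermediate values, its result may differ from the table in the second decimal place.

```r
# Hypothetical helper: t-based confidence interval from summary statistics
ci_from_summary <- function(mean, sd, n, conf_level = 0.95) {
  sem     <- sd / sqrt(n)                              # standard error of the mean
  t_score <- qt(1 - (1 - conf_level) / 2, df = n - 1)  # two-sided critical value
  moe     <- t_score * sem                             # margin of error
  c(lower = mean - moe, upper = mean + moe)
}

# Summary statistics of the sample of 100 students
ci <- ci_from_summary(mean = 174.63, sd = 7.18, n = 100)
round(ci, 1)
```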

Let’s double check our confidence interval calculation by comparing it to the t.test function in R:

check_ci <-
  t.test(data_sam$height, conf.level = 0.95)

check_ci$conf.int

[1] 173.2048 176.0552
attr(,"conf.level")
[1] 0.95

Great, our manual calculation matches the t.test output to two decimal places!

Now that we have our confidence interval, we can report the results as follows: the estimated mean height of the university students is 174.63cm [95% CI 173.20 to 176.06].

How to interpret confidence intervals?

The 95% confidence interval of 173.2cm to 176.06cm represents the range within which the true average height of all university students is likely to be, with 95% confidence.

To be very technical, what this confidence interval means is that if we took many samples of university students and calculated confidence intervals each time, in the long run 95% of these intervals would contain the true average height of the population of university students.

To better understand this concept, consider a simulation where the height of all students at the university is measured, and the average height is found to be 175cm. If we then took a random sample of 100 students and calculated the average height and confidence interval, and repeated this process 99 more times, we would have the average height and confidence intervals for 100 samples of 100 students (see Figure 2).

As Figure 2 illustrates, if we kept taking samples of 100 students and calculated a 95% confidence interval each time, in the long run about 95% of the confidence intervals would contain the true average student height of 175cm.

Figure 2: The figure shows 100 samples taken from a population of students with a known average height of 175cm. The horizontal bars show the confidence interval for each sample. If the bar is grey, it means that it contains 175cm. If the bar is red, it means it does not contain 175cm.
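The code behind Figure 2 is not shown, but the idea can be sketched in a few lines of base R. The population below is simulated and is an assumption (5000 heights drawn from a normal distribution with mean 175 and standard deviation 7), as are the seed and the number of repetitions. The proportion of intervals that contain the true mean should land close to 0.95.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Hypothetical population of 5000 students (distribution and parameters assumed)
population <- rnorm(5000, mean = 175, sd = 7)
true_mean  <- mean(population)   # the "true" average we normally cannot observe

# Draw one random sample and check whether its CI contains the true mean
ci_contains_truth <- function(n = 100, conf_level = 0.95) {
  smp <- sample(population, size = n)
  ci  <- t.test(smp, conf.level = conf_level)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
}

# Repeat the sampling many times and compute the share of intervals that hit
n_sims   <- 2000
covered  <- replicate(n_sims, ci_contains_truth())
coverage <- mean(covered)
coverage
```

Any single interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds across repetitions.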

Of course, we don’t know the true population average student height. However, the procedure used to build the confidence interval from our sample produces a range that contains the true population average height 95% of the time when the sampling is repeated. This is the sense in which we can express “95% confidence” that our interval includes the true average height of all students.

How to report confidence intervals visually?

Confidence intervals are typically visualised as error bars surrounding some point estimate. In our previous example, we had a point estimate of the average height of university students. Let’s use the ggplot2 package to visualise the average height of students with confidence intervals as error bars:

confint_table %>% 
  mutate(group = "student") %>% 
  ggplot2::ggplot(aes(x = group, y = mean)) +
  ggplot2::geom_bar(stat = 'identity', fill = "white", color = "black", width = 0.7) +
  ggplot2::geom_errorbar(aes(ymin = lowci, ymax = hici), width = 0.1) +
  coord_cartesian(ylim = c(165,180)) +
  ggpubr::theme_pubr() +
  labs(x = " ",
       y = "Height (cm)") +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank())

Figure 3: Bar plot showing the average student height with the confidence interval as error bars.

Final Thoughts

Confidence intervals are popular and useful statistics for quantifying and visualising uncertainty. However, their interpretation is not straightforward. Luckily, interactive tools such as web apps that simulate confidence intervals can help with their interpretation.

References

Smithson, Michael. 1999. “Statistics with Confidence: An Introduction for Psychologists.” Statistics with Confidence, 1464. https://www.torrossa.com/gs/resourceProxy?an=4913185&publisher=FZ7200.

Footnotes

¹ The standard deviation is a measure of the amount of variation or dispersion of a set of values. It is calculated as the square root of the variance.
² Although somewhat arbitrary, the 95% confidence level was picked by Sir Ronald Fisher as a good threshold in 1925 because the two-sided z-score of 1.96 is almost exactly 2 standard deviations.
³ The standard error of the mean is a measure of the dispersion of sample means around the population mean. It indicates how different the population mean is likely to be from a sample mean. It is the standard deviation of the sampling distribution of the mean, and is estimated here as the sample standard deviation divided by the square root of the sample size.
⁴ A t-score measures distance from the centre of the t-distribution in units of the standard error. Here it is the critical value that bounds the central 95% of a t-distribution with n - 1 degrees of freedom.
⁵ The margin of error is calculated by multiplying the t-score by the standard error of the mean.
⁶ The confidence interval is then constructed by subtracting the margin of error from, and adding it to, the sample mean [CI = sample mean - margin of error, sample mean + margin of error] (Smithson 1999).