Teaching resources for statistics and R: Monte Carlo simulation of t-test p-values

Why doesn't a small p-values automatically prove the alternative hypothesis? In this third article of my series on teaching resources for stats and R, I share a graphical Monte Carlo simulation I developed to teach my students about classical hypothesis testing and p-values.

I developed this demonstration in R to teach my students about classical hypothesis testing and p-values.

In this Monte Carlo simulation, two random samples are taken from a single, known population; in this case, the population is normally distributed, as can be seen in the grey histogram on the left. One sample is represented by blue dots and the other by red dots, with the mean of each sample represented by a vertical line of the same color.

A t-test is performed on the means of the two samples to compute a p-value. This procedure is repeated 4,999 more times; each time, the new p-value is added to the growing histogram of p-values on the right.

The purpose of this simulation is to demonstrate that p < 0.05 does not prove the alternative hypothesis. In this simulation, the alternative hypothesis is that the two samples come from different populations. And while about 5% of the p-values in the simulation do fall below 0.05, we know for a fact that the alternative hypothesis is false, because the simulated samples are coming from the same population.

A video of this Monte Carlo simulation in action. IF YOU ARE SENSITIVE TO RAPIDLY FLASHING IMAGES, I SUGGEST YOU SKIP AHEAD TO THE END OF THE VIDEO.

The code I used to perform this simulation is below. Note that parameters such as population size, population distribution, sample size, number of times to repeat the simulation, and the alpha level can all be adjusted by modifying the first several lines of code.

### DEMO - Monte Carlo simulation: interpreting a t-test p-value --------------------
# Written by Dr. Taylan Morcol, PhD (https://morcol.com)

N <- 1000000 # population size

population <- rnorm(N) # random sample of the standard normal distribution
hist(population) # visualize the population

NS <- 5000 # number of times to sample
n <- 3 # sample size
a <- 0.05 # alpha level (tolerance for Type I errors)
plot.t <- FALSE # should histogram of t-values be included?

par(mfrow=c(1,if(plot.t){3}else{2})) # 1 x 2 graphics window

p <- c() # empty vector to hold p-values
tvals <- c() # empty vector to hold t-values
for(i in 1:NS){
   s1 <- sample(population, n, replace=TRUE)
   s2 <- sample(population, n, replace=TRUE)
   
   out <- t.test(x=s1, y=s2, var.equal=TRUE)
   p.i <- out$p.value
   p <- c(p, p.i)
   t.i <- out$statistic
   tvals <- c(tvals, t.i)
   
   p.title <- paste("sample #", i, "( n =", n, ") | p =", signif(p.i, 3), if(p.i <= a){"*"})
   hist(population, main=p.title) # histogram of population
   
   points(x=s1, y=rep(0, n), type="p", pch=19, col="red") # sample 1 points
   abline(v=out$estimate[1], lwd=2, col="red") # sample 1 mean
   
   points(x=s2, y=rep(0, n), type="p", pch=17, col="blue") # sample 2 points
   abline(v=out$estimate[2], lwd=2, col="blue") # sample 2 mean
   
   prop.p <- sum(p < a)/length(p) # proportion of p-values less than alpha
   
   hist(p, col="lightblue", breaks=1/a # histogram of all p-values so far
      , xlim=c(0,1), xlab=""
      , main="All p-values"
      , sub=paste("Proportion of p-values < \U03B1:", signif(prop.p,3))
   ) # end hist()
   
   abline(v=a, col="darkgreen", xpd=TRUE) # line at cutoff for significance
   mtext(paste(" <-- \U03B1", "=", a), side=3, at=a, adj=0) # label for line above
   
   if(plot.t){hist(tvals, col="peru")} # end hist()
   
   # slower in beginning (for screen recording); comment out if not recording
   #if(i < 5){Sys.sleep(4)} else if(i < 10){Sys.sleep(2)} else if(i < 15){Sys.sleep(1)}
   
   #readline(prompt="Press [Enter] to continue...") # comment out if non-interactive
   
} # end for() loop

Leave a Reply

Your email address will not be published. Required fields are marked *