library(lme4)
library(effectsize)
library(emmeans)
library(tidyverse)
Effect Size
lmer
With linear mixed-effects models, the main problem is about which standard deviation using to standardize a certain effect.
<- 100
ns <- 50
nt <- c(0.5, -0.5)
cond <- 0.2
b0 <- 0.5
b1 <- 0.2
sb0 <- 0.05
sb1 <- 1 # residual standard deviation
s <- expand_grid(id = 1:ns, trial = 1:nt, cond = cond)
dat
<- rnorm(ns, 0, sb0)
b0i <- rnorm(ns, 0, sb1)
b1i
$y <- with(dat, b0 + b0i[id] + (b1 + b1i[id]) * cond) + rnorm(nrow(dat), 0, s)
dat
<- lmer(y ~ cond + (cond||id), data = dat) # no correlation slopes-intercept fit
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00213929 (tol = 0.002, component 1)
summary(fit)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ cond + ((1 | id) + (0 + cond | id))
Data: dat
REML criterion at convergence: 28726.4
Scaled residuals:
Min 1Q Median 3Q Max
-3.6073 -0.6766 0.0114 0.6745 3.3385
Random effects:
Groups Name Variance Std.Dev.
id (Intercept) 0.04271 0.2067
id.1 cond 0.01067 0.1033
Residual 1.01510 1.0075
Number of obs: 10000, groups: id, 100
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.21856 0.02299 9.506
cond 0.49437 0.02264 21.833
Correlation of Fixed Effects:
(Intr)
cond 0.000
optimizer (nloptwrap) convergence code: 0 (OK)
Model failed to converge with max|grad| = 0.00213929 (tol = 0.002, component 1)
In this situation we have different options. Firstly, let’s compute the standard Cohen’s \(d\) without considering the model:
::cohens_d(y ~ cond, data = dat, paired = TRUE) effectsize
Cohen's d | 95% CI
--------------------------
-0.35 | [-0.38, -0.32]
::cohens_d(y ~ cond, data = dat, paired = FALSE) effectsize
Cohen's d | 95% CI
--------------------------
-0.48 | [-0.52, -0.44]
- Estimated using pooled SD.
Then we can aggregate first and then compute the \(d\):
|>
dat group_by(id, cond) |>
summarise(y = mean(y)) |>
cohens_d(y ~ cond, data = _)
`summarise()` has grouped output by 'id'. You can override using the `.groups`
argument.
Cohen's d | 95% CI
--------------------------
-1.93 | [-2.26, -1.59]
- Estimated using pooled SD.
Here we are ignoring the uncertainty at the trial level and considering only the between-subjects differences.
From the model we can directly compute these quantities:
<- function(fit, which = NULL){
get_variance <- data.frame(VarCorr(fit))
vv <- vv$vcov
vi names(vi) <- tolower(ifelse(is.na(vv$var1), "residual", gsub("\\(|\\)", "", vv$var1)))
if(!is.null(which)){
sum(vi[which])
else{
}
vi
}
}
# using residual standard deviation
fixef(fit)["cond"] / sqrt(get_variance(fit)["residual"])
cond
0.490678
# using only random (intercepts and slopes) standard deviation
fixef(fit)["cond"] / sqrt(get_variance(fit, c("intercept", "cond")))
cond
2.139869
# total variance
fixef(fit)["cond"] / sqrt(get_variance(fit, c("intercept", "cond", "residual")))
cond
0.4782656
Comparing it with the between-subjects Cohen’s \(d\) the model is essentially always computing the between version. The paired works on the differences.
Some references:
- https://imaging.mrc-cbu.cam.ac.uk/statswiki/FAQ/tdunpaired
- https://psycnet.apa.org/record/2014-32656-001
- http://steveharoz.com/blog/2023/simulating-how-replicate-trial-count-impacts-cohens-d-effect-size/
glmer
Let’s see an example with a glm
. Let’s simulate data similar to the previous example but when the response is binary accuracy. The effect size now can be computed as odds ratio. The odds of a probability is calculated as \(p / (1 - p)\) and is interpreted as the number of successes for each failure (or the opposite). For example, \(p = 0.7\) corresponds to an odds of 2.3333333, thus 2.3333333 successes for each failure. Taking the log of the odds create a symmetric function with midpoint on zero.
<- function(p) p / (1 - p)
odds <- seq(0, 1, 0.01)
p par(mfrow = c(1,2))
plot(p, odds(p), type = "l")
plot(p, log(odds(p)), type = "l")
With two conditions or groups, we can compute the ratio of two odds (odds ratio) intepreted as the increase in the odds of success at the numerator compared to the denominator. The OR is an effect size that can be directly intepreted but can also be derived from other effect sizes.
::oddsratio_to_d(OR = 3) # d associated with an OR of 3 effectsize
[1] 0.6056967
::d_to_oddsratio(d = 0.4) # OR for a d = 0.4 effectsize
[1] 2.065805
Now we can simulate data and see how to compute an appropriate effect size:
<- expand_grid(id = 1:ns, trial = 1:nt, cond = cond)
dat
<- qlogis(0.7)
b0 <- log(2) # ~ cohen's d = 0.4
b1 <- 0.3
sb0 <- 0.05
sb1
<- rnorm(ns, 0, sb0)
b0i <- rnorm(ns, 0, sb1)
b1i
<- plogis(with(dat, b0 + b0i[id] + (b1 + b1i[id]) * cond))
p $y <- rbinom(nrow(dat), 1, p)
dat
<- glmer(y ~ cond + (cond|id), data = dat, family = binomial(link = "logit")) fit
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.00231232 (tol = 0.002, component 1)
summary(fit)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial ( logit )
Formula: y ~ cond + (cond | id)
Data: dat
AIC BIC logLik deviance df.resid
11940.5 11976.5 -5965.2 11930.5 9995
Scaled residuals:
Min 1Q Median 3Q Max
-2.4588 -1.1155 0.5340 0.6747 1.0815
Random effects:
Groups Name Variance Std.Dev. Corr
id (Intercept) 0.09035 0.3006
cond 0.01188 0.1090 -1.00
Number of obs: 10000, groups: id, 100
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.85957 0.03766 22.82 <2e-16 ***
cond 0.78746 0.04668 16.87 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
cond -0.097
optimizer (Nelder_Mead) convergence code: 0 (OK)
Model failed to converge with max|grad| = 0.00231232 (tol = 0.002, component 1)
Now we can estimate our effect size. Let’s start by analyzing accuracies using a linear model:
|>
dat group_by(id, cond) |>
summarise(y = mean(y)) |>
::cohens_d(y ~ cond, paired = TRUE, data = _) effectsize
`summarise()` has grouped output by 'id'. You can override using the `.groups`
argument.
Cohen's d | 95% CI
--------------------------
-1.82 | [-2.13, -1.49]
|>
dat group_by(id, cond) |>
summarise(y = mean(y)) |>
::cohens_d(y ~ cond, paired = FALSE, data = _) effectsize
`summarise()` has grouped output by 'id'. You can override using the `.groups`
argument.
Cohen's d | 95% CI
--------------------------
-1.85 | [-2.18, -1.51]
- Estimated using pooled SD.
Now let’s transform the model coefficient into a Cohen’s \(d\):
exp(fixef(fit)) # odds ratio
(Intercept) cond
2.362138 2.197801
::logoddsratio_to_d(fixef(fit)["cond"]) effectsize
cond
0.4341481