重复测量方差分析

statistics
Published

November 10, 2024

Modified

November 10, 2024

含分组因子的重复测量

Show the code
# 加载必要的包
library(tidyverse)
library(afex)      # 进行方差分析
library(emmeans)   # 事后分析
library(car)       # 用于假设检验
library(MASS)      # 用于正态性检验


set.seed(123)
df <- tibble(
    id=1:30 %>% as.factor(),
    group = rep(c("A","B","C"), each = 10) %>% as.factor(),
    t1 = rnorm(30, mean = 8, sd = 1),
    t2 = rnorm(30, mean = 7, sd = 2),
    t3 = rnorm(30, mean = 3, sd = 1)
    
)
# 数据准备:已转换为长格式的 df_long
df_long <- df %>%
    pivot_longer(cols = starts_with("t"),
                 names_to = "time",
                 values_to = "volume") %>%
    mutate(time = factor(time, levels = c("t1", "t2", "t3")),
           group = factor(group))
Show the code
# 重复测量方差分析
anova_results <- aov_ez(id = "id", 
                        dv = "volume", 
                        within = "time", 
                        between = "group", 
                        data = df_long)
anova_results %>% summary()

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

            Sum Sq num Df Error SS den Df   F value Pr(>F)    
(Intercept) 3361.4      1   32.358     27 2804.7682 <2e-16 ***
group          1.7      2   32.358     27    0.7143 0.4986    
time         434.1      2   91.436     54  128.1956 <2e-16 ***
group:time     5.2      4   91.436     54    0.7746 0.5465    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Mauchly Tests for Sphericity

           Test statistic   p-value
time               0.5916 0.0010873
group:time         0.5916 0.0010873


Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

            GG eps Pr(>F[GG])    
time       0.71002  9.551e-16 ***
group:time 0.71002      0.509    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

              HF eps   Pr(>F[HF])
time       0.7381052 2.805049e-16
group:time 0.7381052 5.132499e-01
Show the code
anova_results$lm

Call:
lm(formula = cbind(t1, t2, t3) ~ group, data = structure(list(
    id = structure(1:30, levels = c("1", "2", "3", "4", "5", 
    "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", 
    "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", 
    "27", "28", "29", "30"), class = "factor"), group = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), levels = c("A", 
    "B", "C"), class = "factor", contrasts = "contr.sum"), t1 = c(7.43952435344779, 
    7.76982251051672, 9.55870831414912, 8.07050839142458, 8.12928773516095, 
    9.71506498688328, 8.4609162059892, 6.73493876539347, 7.31314714810647, 
    7.55433802990004, 9.22408179743946, 8.35981382705736, 8.40077145059405, 
    8.11068271594512, 7.44415886524593, 9.78691313680308, 8.49785047822924, 
    6.03338284337036, 8.70135590156369, 7.52720859227207, 6.93217629401316, 
    7.78202508534171, 6.97399555169276, 7.27110877070886, 7.37496073215074, 
    6.31330668925759, 8.83778704449452, 8.15337311783652, 6.86186306298805, 
    9.25381492106993), t2 = c(7.85292844295363, 6.40985703401546, 
    8.79025132209004, 8.75626697506608, 8.64316216327497, 8.37728050820018, 
    8.10783530707518, 6.87617657884656, 6.38807467252017, 6.23905799797523, 
    5.61058604215897, 6.5841654439608, 4.46920729686347, 11.337911930677, 
    9.41592399660998, 4.7537828335933, 6.19423032940185, 6.06668929275356, 
    8.55993023667263, 6.83326186705634, 7.50663702798951, 6.94290648930259, 
    6.91425908541737, 9.73720456802891, 6.54845802868146, 10.0329412088591, 
    3.90249439153956, 8.16922749927214, 7.24770848768923, 7.43188313748795
    ), t3 = c(3.37963948275988, 2.4976765468907, 2.66679261633058, 
    1.98142461689291, 1.92820877352442, 3.30352864140426, 3.44820977862943, 
    3.0530042267305, 3.92226746787974, 5.05008468562714, 2.50896883394346, 
    0.690831124359188, 4.00573852446226, 2.29079923741761, 2.31199138353264, 
    4.0255713696967, 2.71522699294899, 1.77928228774546, 3.18130347974915, 
    2.86110863756096, 3.00576418589989, 3.38528040112633, 2.62933996820759, 
    3.64437654851883, 2.77951343818125, 3.3317819639157, 4.09683901314935, 
    3.4351814908338, 2.67406841446877, 4.14880761845109)), row.names = c(NA, 
-30L), class = "data.frame"))

Coefficients:
             t1        t2        t3      
(Intercept)   7.95290   7.35668   3.02442
group1        0.12173   0.28741   0.09866
group2        0.25573  -0.37411  -0.38734
Show the code
anova_results$lm$contrasts
$group
[1] "contr.sum"
Show the code
contr.sum(c("A","B","C"))
  [,1] [,2]
A    1    0
B    0    1
C   -1   -1

anova_results$lm 中,group 因子采用了 contr.sum 对比编码,而手动拟合的模型(lm(cbind(t1, t2, t3) ~ group, data = df)) 默认使用的是 contr.treatment,即 “treatment” 对比编码(参考水平为第一个因子水平)。这两种对比编码方式的结果系数解释略有不同,从而导致了系数值的差异。

mlm

Show the code
lm_aov <-  lm(formula = cbind(t1, t2, t3) ~ group, data = df)

lm_aov$contrasts
$group
[1] "contr.treatment"
Show the code
contr.treatment(c("A","B","C"))
  B C
A 0 0
B 1 0
C 0 1
Show the code
# 使用 contr.sum 对比编码
contrasts(df$group) <- contr.sum(levels(df$group))


df$group
 [1] A A A A A A A A A A B B B B B B B B B B C C C C C C C C C C
attr(,"contrasts")
  [,1] [,2]
A    1    0
B    0    1
C   -1   -1
Levels: A B C
Show the code
lm_aov2 <-  lm(formula = cbind(t1, t2, t3) ~ group, data = df)

lm_aov2

Call:
lm(formula = cbind(t1, t2, t3) ~ group, data = df)

Coefficients:
             t1        t2        t3      
(Intercept)   7.95290   7.35668   3.02442
group1        0.12173   0.28741   0.09866
group2        0.25573  -0.37411  -0.38734

contr.treatment 编码(Treatment Coding)

  • 定义:也称为“dummy”编码或“参考水平编码”。

  • 目的:在回归模型中将第一个水平(参考水平)作为基准。

  • 机制:模型中每个因子水平的系数表示相对于参考水平的偏离。

  • 编码方式:例如对于一个三水平因子 group = {A, B, C},使用 contr.treatment 时,模型中会以 A 作为基准,生成两个对比变量 groupBgroupC。这两个对比变量的系数解释如下:

    • groupB 的系数表示 B 相对于 A 的平均差异。

    • groupC 的系数表示 C 相对于 A 的平均差异。

contr.sum 编码(Sum Coding)

  • 定义:也称为“和对比编码”或“效果编码”。

  • 目的:确保所有因子水平的系数和为 0。

  • 机制:每个因子水平的系数表示该水平相对于因子的总体平均值的偏差。

  • 编码方式:对于同样的三水平因子 group = {A, B, C},使用 contr.sum 时,模型会生成两个对比变量(与水平数减 1 一致),但它们的解释不同于 contr.treatment

    • group1 表示 A 和总体平均值的偏差。

    • group2 表示 B 和总体平均值的偏差。

    • C 的偏差通过 group1group2 的相加可以得到。

球形检验

检验方差-协方差矩阵是否满足球形假设

Show the code
df_list <- df %>% group_split(group)

df_matrix <- df_list %>%
  lapply(function(x) select(x, t1, t2, t3)) %>%  # 提取t1, t2, t3列
  bind_cols() %>%  # 将每个数据框按列合并
  as.matrix() 
colnames(df_matrix) <- paste(rep(c("A", "B", "C"), each = 3), rep(1:3, 3), sep = "")

rownames(df_matrix) <- 1:10
df_matrix
         A1       A2       A3       B1        B2        B3       C1        C2
1  7.439524 7.852928 3.379639 9.224082  5.610586 2.5089688 6.932176  7.506637
2  7.769823 6.409857 2.497677 8.359814  6.584165 0.6908311 7.782025  6.942906
3  9.558708 8.790251 2.666793 8.400771  4.469207 4.0057385 6.973996  6.914259
4  8.070508 8.756267 1.981425 8.110683 11.337912 2.2907992 7.271109  9.737205
5  8.129288 8.643162 1.928209 7.444159  9.415924 2.3119914 7.374961  6.548458
6  9.715065 8.377281 3.303529 9.786913  4.753783 4.0255714 6.313307 10.032941
7  8.460916 8.107835 3.448210 8.497850  6.194230 2.7152270 8.837787  3.902494
8  6.734939 6.876177 3.053004 6.033383  6.066689 1.7792823 8.153373  8.169227
9  7.313147 6.388075 3.922267 8.701356  8.559930 3.1813035 6.861863  7.247708
10 7.554338 6.239058 5.050085 7.527209  6.833262 2.8611086 9.253815  7.431883
         C3
1  3.005764
2  3.385280
3  2.629340
4  3.644377
5  2.779513
6  3.331782
7  4.096839
8  3.435181
9  2.674068
10 4.148808
Show the code
# 多元线性回归

mlmfit <- lm(df_matrix ~ 1)  # 内部对比(intrasubject contrasts)


mauchly.test(mlmfit, X = ~1) # 这个模型只是估计了反应变量的总体均值,忽略了任何自变量。换句话说,它只计算数据的平均值,并检查这些平均值是否满足球形假设。

    Mauchly's test of sphericity
    Contrasts orthogonal to
    ~1


data:  SSD matrix from lm(formula = df_matrix ~ 1)
W = 5.9785e-06, p-value = 0.000594
Show the code
# 列结构

idata <- data.frame(
    Group = factor(rep(c("A","B","C"),each = 3)),
    Time = rep(1:3,3)
)

mlmfit <- lm(cbind(t1,t2,t3) ~ group, data = df)
mlmfit 

Call:
lm(formula = cbind(t1, t2, t3) ~ group, data = df)

Coefficients:
             t1        t2        t3      
(Intercept)   7.95290   7.35668   3.02442
group1        0.12173   0.28741   0.09866
group2        0.25573  -0.37411  -0.38734
Show the code
manovafit <- manova(cbind(t1, t2, t3) ~ group, data = df)
Show the code
reacttime <- matrix(c(
420, 420, 480, 480, 600, 780,
420, 480, 480, 360, 480, 600,
480, 480, 540, 660, 780, 780,
420, 540, 540, 480, 780, 900,
540, 660, 540, 480, 660, 720,
360, 420, 360, 360, 480, 540,
480, 480, 600, 540, 720, 840,
480, 600, 660, 540, 720, 900,
540, 600, 540, 480, 720, 780,
480, 420, 540, 540, 660, 780),
ncol = 6, byrow = TRUE,
dimnames = list(subj = 1:10,
              cond = c("deg0NA", "deg4NA", "deg8NA",
                       "deg0NP", "deg4NP", "deg8NP")))

mlmfit <- lm(reacttime ~ 1)

mlmfit

Call:
lm(formula = reacttime ~ 1)

Coefficients:
             deg0NA  deg4NA  deg8NA  deg0NP  deg4NP  deg8NP
(Intercept)  462     510     528     492     660     762   
Show the code
mauchly.test(mlmfit, X = ~1)

    Mauchly's test of sphericity
    Contrasts orthogonal to
    ~1


data:  SSD matrix from lm(formula = reacttime ~ 1)
W = 0.031084, p-value = 0.04765
Show the code
idata <- data.frame(deg = gl(3, 1, 6, labels = c(0,4,8)),
                    noise = gl(2, 3, 6, labels = c("A","P")))

# 指定模型中的自变量 表示要检验这两个因素是否满足球形假设
mauchly.test(mlmfit, X = ~ deg + noise, idata = idata)

    Mauchly's test of sphericity
    Contrasts orthogonal to
    ~deg + noise


data:  SSD matrix from lm(formula = reacttime ~ 1)
W = 0.89378, p-value = 0.6381
Show the code
# 指定模型中包含的因子(自变量) 仅检验 noise(即噪声条件的影响)是否满足球形假设。
mauchly.test(mlmfit, M = ~ deg + noise, X = ~ noise, idata = idata)

    Mauchly's test of sphericity
    Contrasts orthogonal to
    ~noise

    Contrasts spanned by
    ~deg + noise


data:  SSD matrix from lm(formula = reacttime ~ 1)
W = 0.96011, p-value = 0.8497
Show the code
mauchly <- df_long %>% rstatix::anova_test(
    dv = volume,
    wid = id,
    between = group,
    within = time,
    type = "3",detailed = T
)
mauchly$`Mauchly's Test for Sphericity`
      Effect     W     p p<.05
1       time 0.592 0.001     *
2 group:time 0.592 0.001     *
Show the code
# 正态性假设检验
shapiro_test <- by(df_long$volume, df_long$time, shapiro.test)
shapiro_test
df_long$time: t1

    Shapiro-Wilk normality test

data:  dd[x, ]
W = 0.97894, p-value = 0.7966

------------------------------------------------------------ 
df_long$time: t2

    Shapiro-Wilk normality test

data:  dd[x, ]
W = 0.98662, p-value = 0.9614

------------------------------------------------------------ 
df_long$time: t3

    Shapiro-Wilk normality test

data:  dd[x, ]
W = 0.98085, p-value = 0.8478
Show the code
# 成对比较
emmeans_results <- emmeans(anova_results, ~ group | time)

pairwise_comparisons <- contrast(emmeans_results, method = "pairwise")
pairwise_comparisons
time = t1:
 contrast estimate    SE df t.ratio p.value
 A - B      -0.134 0.436 27  -0.307  0.9494
 A - C       0.499 0.436 27   1.144  0.4958
 B - C       0.633 0.436 27   1.452  0.3296

time = t2:
 contrast estimate    SE df t.ratio p.value
 A - B       0.662 0.763 27   0.867  0.6653
 A - C       0.201 0.763 27   0.263  0.9626
 B - C      -0.461 0.763 27  -0.604  0.8192

time = t3:
 contrast estimate    SE df t.ratio p.value
 A - B       0.486 0.380 27   1.278  0.4192
 A - C      -0.190 0.380 27  -0.500  0.8720
 B - C      -0.676 0.380 27  -1.778  0.1959

P value adjustment: tukey method for comparing a family of 3 estimates 
Show the code
emmeans(anova_results, pairwise ~  time| group)
$emmeans
group = A:
 time emmean    SE df lower.CL upper.CL
 t1     8.07 0.308 27     7.44     8.71
 t2     7.64 0.540 27     6.54     8.75
 t3     3.12 0.269 27     2.57     3.67

group = B:
 time emmean    SE df lower.CL upper.CL
 t1     8.21 0.308 27     7.58     8.84
 t2     6.98 0.540 27     5.88     8.09
 t3     2.64 0.269 27     2.09     3.19

group = C:
 time emmean    SE df lower.CL upper.CL
 t1     7.58 0.308 27     6.94     8.21
 t2     7.44 0.540 27     6.34     8.55
 t3     3.31 0.269 27     2.76     3.86

Confidence level used: 0.95 

$contrasts
group = A:
 contrast estimate    SE df t.ratio p.value
 t1 - t2     0.431 0.659 27   0.653  0.7923
 t1 - t3     4.952 0.350 27  14.138  <.0001
 t2 - t3     4.521 0.677 27   6.676  <.0001

group = B:
 contrast estimate    SE df t.ratio p.value
 t1 - t2     1.226 0.659 27   1.859  0.1700
 t1 - t3     5.572 0.350 27  15.908  <.0001
 t2 - t3     4.345 0.677 27   6.417  <.0001

group = C:
 contrast estimate    SE df t.ratio p.value
 t1 - t2     0.132 0.659 27   0.200  0.9781
 t1 - t3     4.262 0.350 27  12.170  <.0001
 t2 - t3     4.130 0.677 27   6.099  <.0001

P value adjustment: tukey method for comparing a family of 3 estimates 
Show the code
# 使用 car::Anova 进行多变量检验
anova_results <- Anova(lm_aov2, idata = data.frame(time = factor(c("t1", "t2", "t3"))), idesign = ~time, type = "III")

summary(anova_results, multivariate = TRUE)  # 查看多变量检验结果

Type III Repeated Measures MANOVA Tests:

------------------------------------------
 
Term: (Intercept) 

 Response transformation matrix:
   (Intercept)
t1           1
t2           1
t3           1

Sum of squares and products for the hypothesis:
            (Intercept)
(Intercept)    10084.06

Multivariate Tests: (Intercept)
                 Df test stat approx F num Df den Df     Pr(>F)    
Pillai            1   0.99047 2804.768      1     27 < 2.22e-16 ***
Wilks             1   0.00953 2804.768      1     27 < 2.22e-16 ***
Hotelling-Lawley  1 103.88030 2804.768      1     27 < 2.22e-16 ***
Roy               1 103.88030 2804.768      1     27 < 2.22e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

------------------------------------------
 
Term: group 

 Response transformation matrix:
   (Intercept)
t1           1
t2           1
t3           1

Sum of squares and products for the hypothesis:
            (Intercept)
(Intercept)    5.136233

Multivariate Tests: group
                 Df test stat  approx F num Df den Df  Pr(>F)
Pillai            2 0.0502517 0.7142929      2     27 0.49856
Wilks             2 0.9497483 0.7142929      2     27 0.49856
Hotelling-Lawley  2 0.0529106 0.7142929      2     27 0.49856
Roy               2 0.0529106 0.7142929      2     27 0.49856

------------------------------------------
 
Term: time 

 Response transformation matrix:
   time1 time2
t1     1     0
t2     0     1
t3    -1    -1

Sum of squares and products for the hypothesis:
         time1    time2
time1 728.6962 640.5426
time2 640.5426 563.0533

Multivariate Tests: time
                 Df test stat approx F num Df den Df     Pr(>F)    
Pillai            1  0.957481 292.7437      2     26 < 2.22e-16 ***
Wilks             1  0.042519 292.7437      2     26 < 2.22e-16 ***
Hotelling-Lawley  1 22.518744 292.7437      2     26 < 2.22e-16 ***
Roy               1 22.518744 292.7437      2     26 < 2.22e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

------------------------------------------
 
Term: group:time 

 Response transformation matrix:
   time1 time2
t1     1     0
t2     0     1
t3    -1    -1

Sum of squares and products for the hypothesis:
         time1     time2
time1 8.577925 1.4740641
time2 1.474064 0.7659704

Multivariate Tests: group:time
                 Df test stat approx F num Df den Df   Pr(>F)  
Pillai            2 0.2188202 1.658492      4     54 0.173119  
Wilks             2 0.7821070 1.699761      4     52 0.164141  
Hotelling-Lawley  2 0.2774120 1.733825      4     50 0.157273  
Roy               2 0.2730708 3.686456      2     27 0.038414 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

            Sum Sq num Df Error SS den Df   F value Pr(>F)    
(Intercept) 3361.4      1   32.358     27 2804.7682 <2e-16 ***
group          1.7      2   32.358     27    0.7143 0.4986    
time         434.1      2   91.436     54  128.1956 <2e-16 ***
group:time     5.2      4   91.436     54    0.7746 0.5465    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Mauchly Tests for Sphericity

           Test statistic   p-value
time               0.5916 0.0010873
group:time         0.5916 0.0010873


Greenhouse-Geisser and Huynh-Feldt Corrections
 for Departure from Sphericity

            GG eps Pr(>F[GG])    
time       0.71002  9.551e-16 ***
group:time 0.71002      0.509    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

              HF eps   Pr(>F[HF])
time       0.7381052 2.805049e-16
group:time 0.7381052 5.132499e-01