18 Cox比例风险模型

Modified

November 20, 2024

Cox比例风险模型是一种半参数方法

18.1 风险函数

\[ \begin{aligned} h(t)=&\lim_{\Delta t\to 0}\frac{P(t\le T<t+\Delta t|T\ge t)}{\Delta t}\\ =&\lim_{\Delta t\to 0}\frac{P(t\le T<t+\Delta t\ \&\ T\ge t)}{\Delta t·P(T\ge t) }\\ =&\lim_{\Delta t\to 0}\frac{S(t)- S(t+\Delta t)}{\Delta t·S(t)}\\ =&-\frac{d(\ln S(t))}{dt} \end{aligned} \]

推导出 \(S(t)=e^{-\int_0^t h(u)du}\)

\[ \begin{aligned} h(t)\Delta t=&P(t\le T<t+\Delta t|T\ge t) =\frac{P(t\le T<t+\Delta t\ \&\ T\ge t)}{P(T\ge t) }\\ =&\frac{P(t\le T<t+\Delta t)}{P(T\ge t) }\\ =&\frac{f(t)\Delta t}{S(t)} \end{aligned} \]

推导出 \(f(t)=h(t)S(t)\)

18.2 风险率

对于有风险因子\(x_1,x_2,...,x_k\) 的个体在时间 t 的风险率\(h(t|x_1,x_2,...,x_k)\)

\[ h(t|x_1,x_2,...,x_k)=h_0(t)g(x_1,x_2,...,x_k)=h_0(t)exp(\sum_{j=1}^k\beta_jx_j) \]

其中

\(h0 (t)\)是给定所有风险因子（协变量）为零的随时间变化的基线风险函数。
\(g(X)\)是k个独立风险因子的集合函数，代表变量的风险效应。
\(β_j\)是部分回归系数，表示风险比的比例变化。

18.3 风险比（hazard ratio）

假设有两个个体，分别具有独立变量，两个个体的风险函数之比称为风险比

\[ HR=\frac{h(t|x_1,x_2,...,x_k)}{h(t|x_1^*,x_2^*,...,x_k^*)}=exp(\sum_{j=1}^k\beta_j(x_j-x_j^*)) \]

18.3.1 比例风险假设（proportional hazards assumption）

Cox 模型假设任意两组之间的 HR 随时间保持不变

\[ \frac{h(t)}{h_0(t)}=exp(\sum_{j=1}^k\beta_jx_j) \]

18.3.2 模型系数的估计

条件死亡概率和局部似然函数方法

\[ \ln L_p(\beta)=\sum_{i=1}^{d}\left[ \sum_{j=1}^k\beta_jx_{ij}-\ln\sum_{m\in R_i}exp( \sum_{j=1}^k\beta_jx_{mj}) \right] \]

Newton-Raphson iterative method

\[ \begin{cases} \frac{\partial \ln L_p(\beta)}{\partial \beta_1}=0\\ \frac{\partial \ln L_p(\beta)}{\partial \beta_2}=0\\ \vdots\\ \frac{\partial \ln L_p(\beta)}{\partial \beta_k}=0\\ \end{cases} \]

18.3.3 模型系数的假设检验

Wald‘s test

检验是否有独立变量需要被消除，统计量\(Z=b_j/S_{b_j}\)

当样本量足够大时，Z服从标准正态分布，Z² 服从自由度为1 的\(\chi^2\) 分布

\[ \chi^2_W=(b_j/S_{b_j})^2\sim \chi^2(1) \]
Partial likelihood Ratio test

主要用于非显著性变量的消除，新变量的引入和模型的比较。

\[ \chi^2_{LR}=2\left[ \ln L_p(\beta_k)-\ln L_p(\beta_{k-1}) \right]\sim\chi^2(1) \]

其中分别是包含 k 个和 k-1 个（不包含要检验的第 j 个变量）独立变量的对数局部似然函数

18.4 示例

Show the code

library(survminer)
library(survival)
df <- survival::rotterdam
df <- df %>% mutate(dtime_yrs = dtime/365.25,
                    status = death)

# 拟合Cox比例风险模型 
cox_model <- coxph(Surv(dtime_yrs, status) ~ hormon + chemo + size + er + pgr + nodes + meno + grade + age, data = df)
# 查看模型结果 
summary(cox_model)  
#> Call:
#> coxph(formula = Surv(dtime_yrs, status) ~ hormon + chemo + size + 
#>     er + pgr + nodes + meno + grade + age, data = df)
#> 
#>   n= 2982, number of events= 1272 
#> 
#>                 coef  exp(coef)   se(coef)      z Pr(>|z|)    
#> hormon    -6.553e-02  9.366e-01  8.840e-02 -0.741 0.458535    
#> chemo      5.032e-02  1.052e+00  8.198e-02  0.614 0.539342    
#> size20-50  4.425e-01  1.557e+00  6.536e-02  6.771 1.28e-11 ***
#> size>50    8.222e-01  2.276e+00  9.142e-02  8.993  < 2e-16 ***
#> er        -5.512e-05  9.999e-01  1.107e-04 -0.498 0.618466    
#> pgr       -3.676e-04  9.996e-01  1.226e-04 -2.998 0.002720 ** 
#> nodes      7.295e-02  1.076e+00  4.879e-03 14.953  < 2e-16 ***
#> meno       7.046e-02  1.073e+00  1.006e-01  0.701 0.483583    
#> grade      3.156e-01  1.371e+00  7.082e-02  4.456 8.33e-06 ***
#> age        1.406e-02  1.014e+00  3.830e-03  3.671 0.000242 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#>           exp(coef) exp(-coef) lower .95 upper .95
#> hormon       0.9366     1.0677    0.7876    1.1137
#> chemo        1.0516     0.9509    0.8955    1.2349
#> size20-50    1.5567     0.6424    1.3695    1.7694
#> size>50      2.2755     0.4395    1.9022    2.7221
#> er           0.9999     1.0001    0.9997    1.0002
#> pgr          0.9996     1.0004    0.9994    0.9999
#> nodes        1.0757     0.9296    1.0654    1.0860
#> meno         1.0730     0.9320    0.8810    1.3068
#> grade        1.3711     0.7293    1.1934    1.5753
#> age          1.0142     0.9860    1.0066    1.0218
#> 
#> Concordance= 0.693  (se = 0.008 )
#> Likelihood ratio test= 524.4  on 10 df,   p=<2e-16
#> Wald test            = 609.4  on 10 df,   p=<2e-16
#> Score (logrank) test = 688.1  on 10 df,   p=<2e-16

例如，与未接受激素治疗的患者相比，在任何给定时间接受激素治疗患者的结局（死亡）概率为0.9366。换句话说，他们的生存率提高了6.34%。对于每个组，我们比较了该特征的存在（=1）和不存在（=0）。例如，对于激素治疗，我们有2600名患者没有接受激素治疗，而300名患者接受了激素治疗。因此，系数是指接受治疗与不接受治疗的对数风险率的变化，换句话说，“不接受激素治疗”组是我们的参考。

在解释 Cox regresison 的结果之前，请验证是否遵循比例风险假设。

Cox 模型假设任意两组之间的 HR 随时间保持不变。我们可以使用 cox.zph() 非常轻松地对其进行测试。

Show the code

test <- survival::cox.zph(cox_model)
test
#>         chisq df       p
#> hormon  0.684  1 0.40818
#> chemo   2.100  1 0.14727
#> size    5.206  2 0.07405
#> er     59.748  1 1.1e-14
#> pgr    41.637  1 1.1e-10
#> nodes   4.073  1 0.04358
#> meno    4.284  1 0.03846
#> grade   3.163  1 0.07533
#> age    13.954  1 0.00019
#> GLOBAL 93.751 10 9.6e-16

基于Schoenfeld residuals 的比例风险假设

Show the code

# 绘制每个协变量随时间变化的 Schoenfeld 残差
survminer::ggcoxzph(test, point.size = 0.1)

如果残差随时间显示清晰的模式，则可能表示违反了比例风险假设。

一些有助于解释的提示：

无模式（常数残差）：如果残差随机分布在零附近，没有明确的趋势或模式，则表明比例风险假设是合理的：）
线性趋势：残差随时间变化的线性趋势（增加或减少）可能表明违反了比例风险假设。例如，如果残差在一段时间内始终为正或负，则表示存在时间相关效应。
非线性模式：如果残差表现出非线性模式或特定形状（例如，U 形、V 形），则可能表示偏离比例风险。
并行度：平行度意味着残差的分布和分布在时间上相对恒定。如果残差随时间变宽或变窄，则可能表明违反了假设。

Show the code


###  Concordance index
c_index <- function(data, formula, indices) {
    dat <- data[indices, ]
    
    # 构建回归模型
    fit <- cph(formula = formula, data = dat, x = TRUE, y = TRUE, surv = TRUE)
    
    # 进行预测
    pred <- predict(fit, newdata = dat)
    
    # 计算C-index
   C_index <- 1 - rcorrcens(Surv(dat[[all.vars(formula)[1]]], 
                                   dat[[all.vars(formula)[2]]]) ~ pred)[1]
    
    return(C_index)
}

formula <- Surv(dtime_yrs, status) ~ hormon + chemo + size + er + pgr + nodes + meno + grade + age
library(rms)

c_index(data=df, formula ,indices = 1:nrow(df))
#> [1] 0.6928361

library(boot)
set.seed(1234)
boot_c_index <- boot(data = df,statistic = c_index, R = 200,formula = formula)
boot_c_index 
#> 
#> ORDINARY NONPARAMETRIC BOOTSTRAP
#> 
#> 
#> Call:
#> boot(data = df, statistic = c_index, R = 200, formula = formula)
#> 
#> 
#> Bootstrap Statistics :
#>      original       bias    std. error
#> t1* 0.6928361 0.0009553912 0.007547789

plot(boot_c_index)

Show the code

boot.ci(boot_c_index, conf = 0.95, type = c("norm","basic","perc"))
#> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
#> Based on 200 bootstrap replicates
#> 
#> CALL : 
#> boot.ci(boot.out = boot_c_index, conf = 0.95, type = c("norm", 
#>     "basic", "perc"))
#> 
#> Intervals : 
#> Level      Normal              Basic              Percentile     
#> 95%   ( 0.6771,  0.7067 )   ( 0.6773,  0.7061 )   ( 0.6795,  0.7083 )  
#> Calculations and Intervals on Original Scale
#> Some basic intervals may be unstable
#> Some percentile intervals may be unstable

boot_c_index$t
#>             [,1]
#>   [1,] 0.6840415
#>   [2,] 0.6861004
#>   [3,] 0.6922284
#>   [4,] 0.6871463
#>   [5,] 0.6891063
#>   [6,] 0.7070723
#>   [7,] 0.6975292
#>   [8,] 0.6957405
#>   [9,] 0.6889738
#>  [10,] 0.6973390
#>  [11,] 0.7001275
#>  [12,] 0.6994050
#>  [13,] 0.6899169
#>  [14,] 0.7005129
#>  [15,] 0.7063317
#>  [16,] 0.6996744
#>  [17,] 0.6979745
#>  [18,] 0.6925786
#>  [19,] 0.6904877
#>  [20,] 0.6965846
#>  [21,] 0.6958925
#>  [22,] 0.6992254
#>  [23,] 0.7059247
#>  [24,] 0.7043914
#>  [25,] 0.6888177
#>  [26,] 0.6971457
#>  [27,] 0.6929253
#>  [28,] 0.6930116
#>  [29,] 0.6951884
#>  [30,] 0.7057137
#>  [31,] 0.6853929
#>  [32,] 0.6878291
#>  [33,] 0.6924952
#>  [34,] 0.6932619
#>  [35,] 0.6920361
#>  [36,] 0.6907561
#>  [37,] 0.6911114
#>  [38,] 0.6837897
#>  [39,] 0.7038525
#>  [40,] 0.7024494
#>  [41,] 0.6897124
#>  [42,] 0.7054827
#>  [43,] 0.7045421
#>  [44,] 0.7031605
#>  [45,] 0.6902170
#>  [46,] 0.6897095
#>  [47,] 0.6982245
#>  [48,] 0.6882279
#>  [49,] 0.6810325
#>  [50,] 0.6928143
#>  [51,] 0.6940493
#>  [52,] 0.7023472
#>  [53,] 0.6977338
#>  [54,] 0.6816434
#>  [55,] 0.6902461
#>  [56,] 0.6962076
#>  [57,] 0.6930478
#>  [58,] 0.6782716
#>  [59,] 0.6888177
#>  [60,] 0.6932389
#>  [61,] 0.6820609
#>  [62,] 0.6891455
#>  [63,] 0.6965728
#>  [64,] 0.7051772
#>  [65,] 0.6970262
#>  [66,] 0.7051638
#>  [67,] 0.6978584
#>  [68,] 0.6954607
#>  [69,] 0.6913879
#>  [70,] 0.6828812
#>  [71,] 0.6921640
#>  [72,] 0.6932000
#>  [73,] 0.7002079
#>  [74,] 0.6968225
#>  [75,] 0.6997555
#>  [76,] 0.6846542
#>  [77,] 0.6846691
#>  [78,] 0.7068665
#>  [79,] 0.6961698
#>  [80,] 0.6963645
#>  [81,] 0.6933080
#>  [82,] 0.6947798
#>  [83,] 0.6944429
#>  [84,] 0.6928754
#>  [85,] 0.6946977
#>  [86,] 0.6844625
#>  [87,] 0.6993367
#>  [88,] 0.7109857
#>  [89,] 0.6899890
#>  [90,] 0.6856668
#>  [91,] 0.6951957
#>  [92,] 0.6840721
#>  [93,] 0.6919932
#>  [94,] 0.6915590
#>  [95,] 0.6860581
#>  [96,] 0.6802692
#>  [97,] 0.6896906
#>  [98,] 0.7020619
#>  [99,] 0.6938913
#> [100,] 0.7052154
#> [101,] 0.6868067
#> [102,] 0.6925506
#> [103,] 0.6875501
#> [104,] 0.6997005
#> [105,] 0.7097947
#> [106,] 0.6950401
#> [107,] 0.6887829
#> [108,] 0.6825710
#> [109,] 0.6914889
#> [110,] 0.6869629
#> [111,] 0.6910353
#> [112,] 0.6850879
#> [113,] 0.7083620
#> [114,] 0.7060274
#> [115,] 0.6890010
#> [116,] 0.7033136
#> [117,] 0.7045074
#> [118,] 0.6968568
#> [119,] 0.7020912
#> [120,] 0.6924852
#> [121,] 0.6938881
#> [122,] 0.6974524
#> [123,] 0.7089484
#> [124,] 0.6941822
#> [125,] 0.6880722
#> [126,] 0.6964263
#> [127,] 0.6950960
#> [128,] 0.6922390
#> [129,] 0.7138132
#> [130,] 0.6951402
#> [131,] 0.6775030
#> [132,] 0.6782926
#> [133,] 0.6832530
#> [134,] 0.6896957
#> [135,] 0.6904713
#> [136,] 0.6950142
#> [137,] 0.7022224
#> [138,] 0.7063290
#> [139,] 0.6871596
#> [140,] 0.7035121
#> [141,] 0.6842875
#> [142,] 0.6830441
#> [143,] 0.7038993
#> [144,] 0.6972607
#> [145,] 0.6899501
#> [146,] 0.7055068
#> [147,] 0.6936249
#> [148,] 0.6920770
#> [149,] 0.7045031
#> [150,] 0.6985115
#> [151,] 0.7020377
#> [152,] 0.6894993
#> [153,] 0.6864729
#> [154,] 0.6890899
#> [155,] 0.6860522
#> [156,] 0.7002668
#> [157,] 0.7009822
#> [158,] 0.6844799
#> [159,] 0.6981335
#> [160,] 0.6831765
#> [161,] 0.6866612
#> [162,] 0.6837701
#> [163,] 0.6952964
#> [164,] 0.6928831
#> [165,] 0.7031870
#> [166,] 0.6984845
#> [167,] 0.6928333
#> [168,] 0.6821514
#> [169,] 0.6815036
#> [170,] 0.6834020
#> [171,] 0.6973542
#> [172,] 0.6860725
#> [173,] 0.7000901
#> [174,] 0.6877945
#> [175,] 0.6954614
#> [176,] 0.6961834
#> [177,] 0.6854653
#> [178,] 0.6795140
#> [179,] 0.6983454
#> [180,] 0.6847524
#> [181,] 0.6978928
#> [182,] 0.7032848
#> [183,] 0.6799797
#> [184,] 0.6947564
#> [185,] 0.7014366
#> [186,] 0.6993279
#> [187,] 0.6962003
#> [188,] 0.7002197
#> [189,] 0.7059244
#> [190,] 0.6784809
#> [191,] 0.6935298
#> [192,] 0.6826279
#> [193,] 0.6945671
#> [194,] 0.6955379
#> [195,] 0.6898856
#> [196,] 0.6906719
#> [197,] 0.6881098
#> [198,] 0.6929627
#> [199,] 0.6880540
#> [200,] 0.6990972

quantile(boot_c_index$t,probs = c(.025,.975))
#>      2.5%     97.5% 
#> 0.6799681 0.7071046

18.5 森林图

Show the code

ggforest(cox_model, data = df)