copied from here
Assessment of Discrimination in Survival Analysis (C-statistics, etc)
References
Discrimination
- Differentiating those who will have events and those who will not have events.
- AUC is an established method for logistic regression (higher probability for cases than for non-cases).
- In survival data, the concept of AUC can be defined in different ways.
Different C-statistics and related measures implemented in R
See individual examples below for links to the original papers.
- Summary measures
- Logistic AUC ignoring time-to-event data
- Harrell’s C or concordance (Hmisc::rcorrcens or survival::survConcordance),
- C-statistic by Begg et al. (survAUC::BeggC)
- C-statistic by Uno et al. (survC1::Inf.Cval; survAUC::UnoC)
- Gonen and Heller Concordance Index for Cox models (survAUC::GHCI, CPE::phcpe, clinfun::coxphCPE)
- Integrated AUC (survAUC::IntAUC for AUC.cd, AUC.hc, AUC.sh, AUC.uno)
- \( R^2 \)-type coefficents (survAUC::OXS, Nagelk, XO)
- IDI, NRI, and median improvement (survIDINRI::IDI.INF)
- .
- Time-dependent measures
- Cumulative case/dynamic control ROC/AUC by Heagerty et al. 2000 (survivalROC::svivalROC)
- Incident case/dynamic control ROC/AUC by Heagerty et al. 2005 (risksetROC::risksetROC)
- Cumulative case/dynamic control AUC by Chambless and Diao 2006 (survAUC::AUC.cd)
- Cumulative case/dynamic control AUC by Hung and Chiang 2010 (survAUC::AUC.hc)
- Incident case or Cumulative case/dynamic control AUC by Song and Zhou 2008 (survAUC::AUC.sh)
- Cumulative case/dynamic control AUC by Uno et al. 2007 (survAUC::AUC.uno)
For the explanation of the difference between cumulative/dynamic AUCs vs incident/dynamic AUCs, see the paper by Heargery et al ( http://www.statmed.medicina.unimib.it/statisticalps2011/materiale/Heagerty%20and%20Zheng,%20Biometrics%202005.pdf ).
As I understand it,
- Cumulative/dynamic AUCs assess discrimination of t-year cumulative incidence (risk),
- Incident/dynamic AUCs discrimination of incidence (not cumulative incidence) at t years among those who survived until that point (risk set).
Thus, the idea behind incident/dynamic AUCs is closer to the idea of hazard (dynamically changing instantaneous incidence at a given time), and it can handle time-varying predictors. But the question answered by cumulative/dynamic AUCs may be more clinically relevant, e.g., does this model discriminate if I will survive next five years .
CRAN Task View: Survival Analysis: Predictions and Prediction Performance
- http://cran.r-project.org/web/views/Survival.html
- The pec package provides utilities to plot prediction error curves for several survival models
- peperr implements prediction error techniques which can be computed in a parallelised way. Useful for high-dimensional data.
- survivalROC computes time-dependent ROC curves and time-dependent AUC from censored data using Kaplan-Meier or Akritas’s nearest neighbour estimation method (Cumulative sensitivity and dynamic specificity).
- risksetROC implements time-dependent ROC curves, AUC and integrated AUC of Heagerty and Zheng (Biometrics, 2005).
- Various time-dependent true/false positive rates and Cumulative/Dynamic AUC are implemented in the survAUC package.
- The survcomp package (Bioconductor) provides several functions to assess and compare the performance of survival models.
- C-statistics for risk prediction models with censored survival data can be computed via the survC1 package.
- The survIDINRI package implements the integrated discrimination improvement index and the category-less net reclassification index for comparing competing risks prediction models.
PBC dataset
age: in years
albumin: serum albumin (g/dl)
alk.phos: alkaline phosphotase (U/liter)
ascites: presence of ascites
ast: aspartate aminotransferase, once called SGOT (U/ml)
bili: serum bilirunbin (mg/dl)
chol: serum cholesterol (mg/dl)
copper: urine copper (ug/day)
edema: 0 no edema, 0.5 untreated or successfully treated
1 edema despite diuretic therapy
hepato: presence of hepatomegaly or enlarged liver
id: case number
platelet: platelet count
protime: standardised blood clotting time
sex: m/f
spiders: blood vessel malformations in the skin
stage: histologic stage of disease (needs biopsy)
status: status at endpoint, 0/1/2 for censored, transplant, dead
time: number of days between registration and the earlier of death,
transplantion, or study analysis in July, 1986
trt: 1/2/NA for D-penicillmain, placebo, not randomised
trig: triglycerides (mg/dl)
Load the PBC dataset and modify for later use
library(survival)
data(pbc)
pbc <- within(pbc, {
event <- as.numeric(status %in% c(1,2))
Surv <- Surv(time, event)
})
Kaplan-Meier plots
par(mar = c(2,2,2,2))
layout(matrix(1:4, byrow = T, ncol = 2))
library(rms)
survplot(survfit(Surv ~ 1, data = pbc))
survplot(survfit(Surv ~ age >= median(age), data = pbc), label.curves = list(method = "arrow", cex = 1.2))
survplot(survfit(Surv ~ sex, data = pbc), label.curves = list(method = "arrow", cex = 1.2))
survplot(survfit(Surv ~ albumin >= median(albumin), data = pbc), label.curves = list(method = "arrow", cex = 1.2))
AUC by logistic regression models
Completely ignore the time variable
It is the simplest method. Logistic regression is used instead of Cox regression model.
Completely ignore the time variable and use the outcome variable as a binary outcome variable. It does not take into acount the variable length of follow-up.
library(epicalc)
logit.age.sex <- glm(event ~ age + sex, data = pbc, family = binomial)
lroc(logit.age.sex, graph = F)$auc
[1] 0.5879
logit.age.sex.albumin <- glm(event ~ age + sex + albumin, data = pbc, family = binomial)
lroc(logit.age.sex.albumin, graph = F)$auc
[1] 0.6751
Cut the follow up at a specifict time point
Only events that occured within two years are considered events and others are treated as non-events. This method can be valid if the specified time is short enough so that there are few censored subjects. Most people have complete follow-up in this situation.
pbc <- within(pbc, {
outcome2yr <- NA
outcome2yr[(event == 1) & (time <= 2 * 365)] <- 1
outcome2yr[(event == 0) | (time > 2 * 365)] <- 0
})
logit.age.sex <- glm(outcome2yr ~ age + sex, data = pbc, family = binomial)
lroc(logit.age.sex, graph = F)$auc
[1] 0.6484
logit.age.sex.albumin <- glm(outcome2yr ~ age + sex + albumin, data = pbc, family = binomial)
lroc(logit.age.sex.albumin, graph = F)$auc
[1] 0.7918
Fit Cox regression models for later use
The linear predictors (lp), when exponetiated, will provide the predicted hazard ratios for individuals. Thus these can be used as the summary predictors calculated from multiple raw predictors.
coxph.null <- coxph(Surv ~ 1, data = pbc)
coxph.null
Call: coxph(formula = Surv ~ 1, data = pbc)
Null model
log likelihood= -1009
n= 418
coxph.age.sex <- coxph(Surv ~ age + sex, data = pbc)
coxph.age.sex
Call:
coxph(formula = Surv ~ age + sex, data = pbc)
coef exp(coef) se(coef) z p
age 0.0221 1.022 0.00727 3.04 0.0024
sexf -0.3000 0.741 0.20975 -1.43 0.1500
Likelihood ratio test=12 on 2 df, p=0.0025 n= 418, number of events= 186
coxph.age.sex.albumin <- coxph(Surv ~ age + sex + albumin, data = pbc)
coxph.age.sex.albumin
Call:
coxph(formula = Surv ~ age + sex + albumin, data = pbc)
coef exp(coef) se(coef) z p
age 0.0148 1.015 0.00749 1.97 4.9e-02
sexf -0.3467 0.707 0.21079 -1.64 1.0e-01
albumin -1.4066 0.245 0.17096 -8.23 2.2e-16
Likelihood ratio test=73.9 on 3 df, p=6.66e-16 n= 418, number of events= 186
anova(coxph.age.sex, coxph.age.sex.albumin, test = "LRT")
Analysis of Deviance Table
Cox model: response is Surv
Model 1: ~ age + sex
Model 2: ~ age + sex + albumin
loglik Chisq Df P(>|Chi|)
1 -1003
2 -972 61.9 1 3.7e-15 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pbc$lp.null <- predict(coxph.null, type = "lp")
pbc$lp.age.sex <- predict(coxph.age.sex, type = "lp")
pbc$lp.age.sex.albumin <- predict(coxph.age.sex.albumin, type = "lp")
Harrell’s C or concordance
Trying to compare (test) the difference in C is not recommended.
http://stats.stackexchange.com/questions/17480/how-to-do-roc-analysis-in-r-with-a-cox-model/17517#17517
Hmisc::rcorrcens
A larger marker value is considered to be associated with a longer survival by this function. Thus, the linear predictor (the higher, the worse) needs to be negated
library(Hmisc)
rcorrcens(formula = Surv ~ I(-1 * lp.age.sex), data = pbc)
Somers' Rank Correlation for Censored Data Response variable:Surv
C Dxy aDxy SD Z P n
I(-1 * lp.age.sex) 0.569 0.138 0.138 0.043 3.18 0.0015 418
rcorrcens(formula = Surv ~ I(-1 * lp.age.sex.albumin), data = pbc)
Somers' Rank Correlation for Censored Data Response variable:Surv
C Dxy aDxy SD Z P n
I(-1 * lp.age.sex.albumin) 0.703 0.407 0.407 0.041 9.86 0 418
survival package
Actually, the summary method for coxph objects prints “Concordance” (five lines from bottom), which is the same thing as the Harrells’C, and \( R^2 \).
summary(coxph.age.sex)
Call:
coxph(formula = Surv ~ age + sex, data = pbc)
n= 418, number of events= 186
coef exp(coef) se(coef) z Pr(>|z|)
age 0.02210 1.02234 0.00727 3.04 0.0024 **
sexf -0.29995 0.74085 0.20975 -1.43 0.1527
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
age 1.022 0.978 1.008 1.04
sexf 0.741 1.350 0.491 1.12
Concordance= 0.569 (se = 0.023 )
Rsquare= 0.028 (max possible= 0.992 )
Likelihood ratio test= 12 on 2 df, p=0.0025
Wald test = 12.2 on 2 df, p=0.0022
Score (logrank) test = 12.3 on 2 df, p=0.00214
summary(coxph.age.sex.albumin)
Call:
coxph(formula = Surv ~ age + sex + albumin, data = pbc)
n= 418, number of events= 186
coef exp(coef) se(coef) z Pr(>|z|)
age 0.01476 1.01487 0.00749 1.97 0.049 *
sexf -0.34672 0.70700 0.21079 -1.64 0.100 .
albumin -1.40664 0.24496 0.17096 -8.23 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
age 1.015 0.985 1.000 1.030
sexf 0.707 1.414 0.468 1.069
albumin 0.245 4.082 0.175 0.342
Concordance= 0.703 (se = 0.023 )
Rsquare= 0.162 (max possible= 0.992 )
Likelihood ratio test= 73.9 on 3 df, p=6.66e-16
Wald test = 83.1 on 3 df, p=0
Score (logrank) test = 82.2 on 3 df, p=0
Gonen and Heller Concordance Index for Cox models
- Gonen M, et al. Concordance probability and discriminatory power in proportional hazards regression. Biometrika 2005;92:965-970.
CPE package
library(CPE)
phcpe(coxph.age.sex, CPE.SE = TRUE)
$CPE
[1] 0.5724
$CPE.SE
[1] 0.02023
phcpe(coxph.age.sex.albumin, CPE.SE = TRUE)
$CPE
[1] 0.6615
$CPE.SE
[1] 0.01531
clinfun package
library(clinfun)
coxphCPE(coxph.age.sex)
CPE smooth.CPE se.CPE
0.57244 0.57236 0.02023
coxphCPE(coxph.age.sex.albumin)
CPE smooth.CPE se.CPE
0.66150 0.66128 0.01531
Cumulative case/dynamic control ROC/AUC using survivalROC package (Heagerty, 2000)
library(survivalROC)
fun.survivalROC <- function(lp, t) {
res <- with(pbc,
survivalROC(Stime = time,
status = event,
marker = get(lp),
predict.time = t,
method = "KM"))
with(res, plot(TP ~ FP, type = "l", main = sprintf("t = %.0f, AUC = %.2f", t, AUC)))
abline(a = 0, b = 1, lty = 2)
res
}
layout(matrix(1:10, byrow = T, ncol = 5))
res.survivalROC.age.sex <- lapply(1:10 * 365.25, function(t) {
fun.survivalROC(lp = "lp.age.sex", t)
})
res.survivalROC.age.sex.albumin <- lapply(1:10 * 365.25, function(t) {
fun.survivalROC(lp = "lp.age.sex.albumin", t)
})
Incident case/dynamic control ROC/AUC using risksetROC package (Heagerty, 2005)
- Heagerty PJ, et al., Survival model predictive accuracy and ROC curves., Biometrics. 2005 Mar;61(1):92-105.
- http://www.ncbi.nlm.nih.gov/pubmed/15737082
http://faculty.washington.edu/heagerty/Software/SurvROC/
risksetAUC(): This function creates risksetAUC from a survival data set.
risksetROC(): This function creates risksetROC from a survival data set.
This package calculates the incidence-based time-dependent ROC among the risk set (subpopulation) at time t.
Cases are those who died at time t (incident cases).
Controls are those who survived until time t (dynamic controls)
Use of incident cases rather than cumulative cases allows for assessment of time-dependent predictors.
1- to 10-year ROCs
library(risksetROC)
fun.risksetROC <- function(lp, t) {
res <- with(pbc,
risksetROC(Stime = time,
status = event,
marker = get(lp),
predict.time = t,
plot = FALSE))
with(res, plot(TP ~ FP, type = "l", main = sprintf("t = %.0f, AUC = %.2f", t, AUC)))
abline(a = 0, b = 1, lty = 2)
res
}
layout(matrix(1:10, byrow = T, ncol = 5))
risksetROC.age.sex <- lapply(365.25 * 1:10, function(t) {
fun.risksetROC(lp = "lp.age.sex", t)
})
risksetROC.age.sex.albumin <- lapply(365.25 * 1:10, function(t) {
fun.risksetROC(lp = "lp.age.sex.albumin", t)
})
Up to 10-year AUCs
layout(matrix(1:2, byrow = T, ncol = 2))
risksetAUC.age.sex <- with(pbc,
risksetAUC(Stime = time,
status = event,
marker = lp.age.sex,
tmax = 10 * 365.25))
title(sprintf("t = %.0f, iAUC = %.2f", 10 * 365.25, risksetAUC.age.sex$Cindex))
risksetAUC.age.sex.albumin <- with(pbc,
risksetAUC(Stime = time,
status = event,
marker = lp.age.sex.albumin,
tmax = 10 * 365.25))
title(sprintf("t = %.0f, iAUC = %.2f", 10 * 365.25, risksetAUC.age.sex.albumin$Cindex))
Various methods provided survAUC package (Potapov et al)
These can calculate multiple time-dependent ROC at once, and also compute summary measures of a time-dependent AUC curve (iAUC)
These need a training dataset and a test dataset. The same data can be given to both, and it works although I am not sure if this is correct.
Functions
- AUC.cd(): AUC estimator proposed by Chambless and Diao
- AUC.hc(): AUC estimator proposed by Hung and Chiang
- AUC.sh(): AUC estimator proposed by Song and Zhou
- AUC.uno(): AUC estimator proposed by Uno et al.
- BeggC(): C-statistic by Begg et al.
- GHCI(): Gonen and Heller’s Concordance Index for Cox models
- IntAUC(): Integration of time-dependent AUC curves
- OXS(): R2-type coefficients for Cox proportional hazards models
- plot.survAUC(): Plot method for survAUC and survErr Objects
- predErr(): Distance-based estimators of survival predictive accuracy
- schemper(): Distance-based estimator of survival predictive accuracy proposed by Schemper and Henderson
- UnoC() C-statistic by Uno et al.
Time-dependent AUCs for the age sex model are calculated by various methods. (1- to 10-year AUCs)
library(survAUC)
res.AUC.cd <- AUC.cd(Surv.rsp = pbc$Surv,
Surv.rsp.new = pbc$Surv,
lp = coxph.age.sex$linear.predictors,
lpnew = coxph.age.sex$linear.predictors,
times = 1:10 * 365.25)
res.AUC.cd$iauc
[1] 0.5895
plot(res.AUC.cd)
res.AUC.hc <- AUC.hc(Surv.rsp = pbc$Surv,
Surv.rsp.new = pbc$Surv,
lpnew = coxph.age.sex$linear.predictors,
times = 1:10 * 365.25)
res.AUC.hc$iauc
[1] 0.6005
plot(res.AUC.hc)
res.AUC.sh <- AUC.sh(Surv.rsp = pbc$Surv,
Surv.rsp.new = pbc$Surv,
lp = coxph.age.sex$linear.predictors,
lpnew = coxph.age.sex$linear.predictors,
times = 1:10 * 365.25)
res.AUC.sh$iauc
[1] 0.573
plot(res.AUC.sh)
res.AUC.uno <- AUC.uno(Surv.rsp = pbc$Surv,
Surv.rsp.new = pbc$Surv,
lpnew = coxph.age.sex$linear.predictors,
times = 1:10 * 365.25)
res.AUC.uno$iauc
[1] 0.5811
plot(res.AUC.uno)
Summary measures (10 years when applicable)
res.UnoC <- UnoC(Surv.rsp = pbc$Surv,
Surv.rsp.new = pbc$Surv,
lpnew = coxph.age.sex$linear.predictors,
time = 10 * 365.25)
res.UnoC
[1] 0.5608
res.BeggC <- BeggC(Surv.rsp = pbc$Surv,
Surv.rsp.new = pbc$Surv,
lp = coxph.age.sex$linear.predictors,
lpnew = coxph.age.sex$linear.predictors)
res.BeggC
[1] 0.5519
res.GHCI <- GHCI(
lpnew = coxph.age.sex$linear.predictors)
res.GHCI
[1] 0.5717
*R2-type Coefficients for Cox models *
res.OXS <- OXS(Surv.rsp = pbc$Surv,
lp = coxph.age.sex$linear.predictors,
lp0 = coxph.null$linear.predictors)
res.OXS
[1] 0.06092
res.Nagelk <- Nagelk(Surv.rsp = pbc$Surv,
lp = coxph.age.sex$linear.predictors,
lp0 = coxph.null$linear.predictors)
res.Nagelk
[1] 0.02779
res.XO <- XO(Surv.rsp = pbc$Surv,
lp = coxph.age.sex$linear.predictors,
lp0 = coxph.null$linear.predictors)
res.XO
[1] 0.04166
Uno methods for C-statistics and IDI/NRI implemented in survC1 and survIDINRI
pbc$female <- as.numeric(pbc$sex == "f")
C-statistics (10-years follow up) using survC1 package
- Hajime Uno, Tianxi Cai, Michael J. Pencina, Ralph B. DAgostino, and LJ Wei. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Statistics in Medicine 2011, 30:1105-16.
library(survC1)
unoC.age.female <- Est.Cval(mydata = pbc[,c("time","event","age","female")], tau = 10 * 365.25)
unoC.age.female$Dhat
[1] 0.5618
unoC.age.female.albumin <- Est.Cval(mydata = pbc[,c("time","event","age","female","albumin")], tau = 10 * 365.25)
unoC.age.female.albumin$Dhat
[1] 0.672
Comparison of C-statistics
mydata: Input data. The 1st column should be time-to-event, and the
2nd column is event indicator (1=event, 0=censor).
covs0: A matrix that consists of a set of predictors for a base
model (Model 0)
covs1: A matrix that consists of a set of predictors for a new model
(Model 1)
tau: Truncation time. The resulting C tells how well the given
prediction model works in predicting events that occur in the
time range from 0 to ‘tau’. Note that the survival function
for the underlying censoring time distribution needs to be
positive at ‘tau’.
itr: Iteration of perturbation-resampling.
uno.C.delta <- Inf.Cval.Delta(mydata = pbc[,c("time","event")],
covs0 = pbc[,c("age","female")],
covs1 = pbc[,c("age","female","albumin")],
tau = 10 * 365.25,
itr = 1000)
uno.C.delta
Est SE Lower95 Upper95
Model1 0.6720 0.02137 0.63009 0.7138
Model0 0.5618 0.02266 0.51736 0.6061
Delta 0.1102 0.02367 0.06384 0.1566
IDI, continous NRI, and median improvement (10-years follow up) using survIDINRI
- Uno H, Tian L, Cai T, Kohane IS, Wei LJ. A unified inference procedure for a class of measures to assess improvement in risk prediction systems with survival data, Statistics in Medicine 2012. doi:10.1002/sim.5647
library(survIDINRI)
res.IDI.INF <- IDI.INF(indata = pbc[,c("time","event")],
covs0 = pbc[,c("age","female")],
covs1 = pbc[,c("age","female","albumin")],
t0 = 10 * 365.25,
npert = 300, npert.rand = NULL, seed1 = NULL, alpha = 0.05)
IDI.INF.OUT(res.IDI.INF)
Est. Lower Upper
M1 0.101 0.027 0.162
M2 0.254 0.015 0.453
M3 0.094 0.018 0.180
IDI.INF.GRAPH(res.IDI.INF)