Dynamic survival prediction for multivariate joint models using the R package joineRML

Introduction Model Estimation Software & Example revisited Prediction References Dynamic
survival prediction for multivariate joint models using the R package joineRML Graeme L. Hickey1, Pete Philipson2, Andrea Jorgensen1, Ruwanthi Kolamunnage-Dona1 1Department of Biostatistics, University of Liverpool, UK 2Mathematics, Physics and Electrical Engineering, University of Northumbria, UK [email protected] 17th December 2017 GL. Hickey Joint modelling of multivariate data 1 / 28

Introduction Model Estimation Software & Example revisited Prediction References Preamble
Funding Grant MR/M013227/1 Slides Available from www.glhickey.com GL. Hickey Joint modelling of multivariate data 2 / 28

Introduction Model Estimation Software & Example revisited Prediction References Joint
modelling Covariate data Longitudinal outcomes data Random effects GL. Hickey Joint modelling of multivariate data 3 / 28

modelling Covariate data Time-to-event outcomes data Frailties GL. Hickey Joint modelling of multivariate data 3 / 28

modelling Covariate data Time-to-event outcomes data Frailties Longitudinal outcomes data Random effects GL. Hickey Joint modelling of multivariate data 3 / 28

modelling Typically used for inference, but growing interest in using for dynamic prediction Typically used with a single longitudinal outcome, but growing interest in using multiple (or multivariate) longitudinal outcomes We will describe a recent –package that resolves these two issues GL. Hickey Joint modelling of multivariate data 4 / 28

Introduction Model Estimation Software & Example revisited Prediction References Clinical
example Consider a subset of 154 patients with primary biliary cirrhosis (PBC) randomized to placebo treatment from Mayo Clinic trial (Murtaugh et al. 1994) Multiple biomarkers repeatedly measured at intermittent times, of which we consider 3 clinically relevant ones: 1 serum bilirubin (mg/dl) 2 serum albumin (mg/dl) 3 prothrombin time (seconds) GL. Hickey Joint modelling of multivariate data 5 / 28

Introduction Model Estimation Software & Example revisited Prediction References Prothrombin
time (0.1 x seconds)−4 Albumin (mg dL) Serum bilirubin (loge mg dL) 0 5 10 0 5 10 0 5 10 0.0 0.5 1.0 1.5 2 4 6 8 −2 0 2 4 Time from registration (years) Alive Dead The natural disease progression of untreated patients with PBC enrolled into the trial GL. Hickey Joint modelling of multivariate data 6 / 28

Introduction Model Estimation Software & Example revisited Prediction References 0
2 4 6 8 10 12 14 0.0 0.2 0.4 0.6 0.8 1.0 Time from registration (years) Survival probability nevents = 69, 44.8% GL. Hickey Joint modelling of multivariate data 7 / 28

Introduction Model Estimation Software & Example revisited Prediction References Data
For each subject i = 1, . . . , n, we observe yi = (yi1 , . . . , yiK ) is a K-variate continuous outcome vector, where each yik denotes an (nik × 1)-vector of observed longitudinal measurements for the k-th outcome type: yik = (yi1k, . . . , yinik k) Observation times tijk for j = 1, . . . , nik, which can diﬀer between subjects and outcomes (Ti , δi ), where Ti = min(T∗ i , Ci ), where T∗ i is the true event time, Ci corresponds to a potential right-censoring time, and δi is the failure indicator equal to 1 if the failure is observed (T∗ i ≤ Ci ) and 0 otherwise GL. Hickey Joint modelling of multivariate data 8 / 28

Introduction Model Estimation Software & Example revisited Prediction References Longitudinal
sub-model We extend the univariate joint model from Henderson et al. (2000): yik(t) = xik (t)βk + zik (t)bik + εik(t), with k = 1, . . . , K where xik(t) is a pk-vector of (possibly) time-varying covariates with corresponding fixed effect terms βk bik ∼ N(0, Dkk) subject-and-marker specific random effects cov(bik, bil ) = Dkl for k = l εik(t) is the model error term, which is i.i.d. N(0, σ2 k ) and independent of bik GL. Hickey Joint modelling of multivariate data 9 / 28

Introduction Model Estimation Software & Example revisited Prediction References Time-to-event
sub-model λi (t) = lim dt→0 P(t ≤ Ti < t + dt | Ti ≥ t) dt = λ0(t) exp vi (t)γv + Wi (t) , where λ0(·) is an unspecified baseline hazard function vi (t) is a q-vector of (possibly) time-varying covariates with corresponding fixed effect terms γv Wi (t) = K k=1 γyk{zik (t)bik} independent of the censoring process GL. Hickey Joint modelling of multivariate data 10 / 28

Introduction Model Estimation Software & Example revisited Prediction References Likelihood
The observed data likelihood is given by n i=1 ∞ −∞ f (yi | bi , θ)f (Ti , δi | bi , θ)f (bi | θ)dbi where θ = (β , vech(D), σ2 1 , . . . , σ2 K , λ0(t), γv , γy ), β = (β1 , . . . , βK ), bi = (bi1 , . . . , biK ) ∼ N(0, D) GL. Hickey Joint modelling of multivariate data 11 / 28

Introduction Model Estimation Software & Example revisited Prediction References Estimation
Estimation of θ done using Monte Carlo Expectation-Maximisation (MCEM) algorithm (treating the random eﬀects as missing data) M-step: closed-form updates for all parameters except γ = (γv , γy ), so use a one-step Newton-Raphson iteration E-step: requires calculating several multidimensional integrals of form E h(bi ) | Ti , δi , yi ; ˆ θ , which can be evaluated by: 1 Monte Carlo integration 2 antithetic variates for variance reduction 3 quasi-Monte Carlo (low-discrepancy deterministic sequences, e.g Sobol sequence) Dynamic update rule for MC size and convergence criterion SE estimation by bootstrap or empirical information matrix approximation GL. Hickey Joint modelling of multivariate data 12 / 28

Introduction Model Estimation Software & Example revisited Prediction References joineRML
+ getVarCov() vcov() fixef() ranef()* AIC() BIC() confint() formula() sampleData() dynSurv()* dynLong()* print() summary() plot() sigma() coef() update() baseHaz() residuals() fitted() logLik() bootSE() Rich collection of associated methods * associated with additional plot methods mjoint() Version 0.4.0 available on CRAN https://cran.r-project.org/web/packages/joineRML/ Developmental version available on GitHub https://github.com/graemeleehickey/joineRML Parallel Computing + GL. Hickey Joint modelling of multivariate data 13 / 28

Introduction Model Estimation Software & Example revisited Prediction References Proposed
model for PBC data Longitudinal sub-model log(serBilir) = (β0,1 + b0i,1 ) + (β1,1 + b1i,1 )year + εij1, albumin = (β0,2 + b0i,2 ) + (β1,2 + b1i,2 )year + εij2, (0.1 × prothrombin)−4 = (β0,3 + b0i,3 ) + (β1,3 + b1i,3 )year + εij3, bi ∼ N6 (0, D), and εijk ∼ N(0, σ2 k ) for k = 1, 2, 3; Time-to-event sub-model λi (t) = λ0 (t) exp {γv age + W2i (t)} , W2i (t) = γbil (b0i,1 + b1i,1 t) + γalb (b0i,2 + b1i,2 t) + γpro (b0i,3 + b1i,3 t). GL. Hickey Joint modelling of multivariate data 14 / 28

Introduction Model Estimation Software & Example revisited Prediction References Example
code data(pbc2) placebo <- subset(pbc2, drug == "placebo") fit.pbc <- mjoint( formLongFixed = list( "bil" = log(serBilir) ˜ year, "alb" = albumin ˜ year, "pro" = (0.1 * prothrombin)ˆ-4 ˜ year), formLongRandom = list( "bil" = ˜ year | id, "alb" = ˜ year | id, "pro" = ˜ year | id), formSurv = Surv(years, status2) ˜ age, data = placebo, timeVar = "year", control = list(tol0 = 0.001, burin = 400)) GL. Hickey Joint modelling of multivariate data 15 / 28

Introduction Model Estimation Software & Example revisited Prediction References Results
Parameter Estimate SE 95% CI β0,1 0.5541 0.0858 (0.3859, 0.7223) β1,1 0.2009 0.0201 (0.1616, 0.2402) β0,2 3.5549 0.0356 (3.4850, 3.6248) β1,2 -0.1245 0.0101 (-0.1444, -0.1047) β0,3 0.8304 0.0212 (0.7888, 0.8719) β1,3 -0.0577 0.0062 (-0.0699, -0.0456) γv 0.0462 0.0151 (0.0166, 0.0759) γbil 0.8181 0.2046 (0.4171, 1.2191) γalb -1.7060 0.6181 (-2.9173, -0.4946) γpro -2.2085 1.6070 (-5.3582, 0.9412) GL. Hickey Joint modelling of multivariate data 16 / 28

Introduction Model Estimation Software & Example revisited Prediction References Results
Eﬀect of multivariate inference over univariate joint model: Parameter Model Estimate 95% CI γbil UV 1.2182 (0.9789, 1.6130) γbil MV 0.8181 (0.4171, 1.2191) γalb UV -3.0770 (-4.4865, -2.3466) γalb MV -1.7060 (-2.9173, -0.4946) γpro UV -7.2078 (-10.5410, -5.3917) γpro MV -2.2085 (-5.3582, 0.9412) UV = univariate joint model (ﬁtted with joineR package); MV = multivariate joint model GL. Hickey Joint modelling of multivariate data 17 / 28

prediction So far we have only discussed inference from joint models Dynamic prediction Can we predict failure probability at time u > t given longitudinal data observed up until time t Utility of multivariate data Do we gain from using multiple longitudinal outcomes? GL. Hickey Joint modelling of multivariate data 18 / 28

prediction For a new subject i = n + 1 with available longitudinal measurements yn+1{t} up to time t, we want to calculate P[T∗ n+1 ≥ u | T∗ n+1 > t, yn+1{t}; θ] = E Sn+1 (u | bn+1; θ) Sn+1 (t | bn+1; θ) , where the expectation is taken with respect to the distribution p(bn+1 | T∗ n+1 > t, yn+1{t}; θ), and Sn+1 (u | bn+1; θ) = exp − u 0 λ0(s) exp{vn+1 (s)γv + Wn+1(s)}ds GL. Hickey Joint modelling of multivariate data 19 / 28

prediction Rizopoulos (2011) proposed two estimators for this: 1. First-order approximation P[T∗ n+1 ≥ u | T∗ n+1 > t, yn+1{t}; θ] ≈ Sn+1 u | ˆ bn+1 ; ˆ θ Sn+1 t | ˆ bn+1 ; ˆ θ , where ˆ bn+1 is the mode of p(bn+1 | T∗ n+1 > t, yn+1{t}; ˆ θ) 2. Simulated scheme 1 Draw θ(l) ∼ N(ˆ θ, V (ˆ θ)) 2 Draw b(l) n+1 ∼ p(bn+1 | T∗ n+1 > t, yn+1{t}; θ(l)) [Metropolis-Hastings] 3 Calculate Sn+1 u | b(l) n+1 ;θ(l) Sn+1 t | b(l) n+1 ;θ(l) 4 Repeat Steps 1–3 l = 2, . . . , L times GL. Hickey Joint modelling of multivariate data 20 / 28

code # New patient nd <- subset(placebo, id == "11") # patient 11 nd <- nd[1:4, ] # take first four follow-up times # First-order prediction (default) pred1 <- dynSurv(fit.pbc, nd) pred1 plot(pred1) # Simulated prediction pred2 <- dynSurv(fit.pbc, nd, type = "simulated", scale = 1.6) pred2 plot(pred2) GL. Hickey Joint modelling of multivariate data 21 / 28

code First order approximation 0.2 0.3 0.4 0.5 0.6 0.7 0.8 data.t$tk[[k]] log, serBilir 3.7 3.8 3.9 4.0 4.1 data.t$tk[[k]] albumin 0.45 0.50 0.55 0.60 0.65 data.t$tk[[k]] ^, (0.1 * prothrombin), −4 0 2 4 6 8 10 12 14 xpts ypts 0.0 0.2 0.4 0.6 0.8 1.0 Time Event−free probability GL. Hickey Joint modelling of multivariate data 22 / 28

code Simulated scheme 0.2 0.3 0.4 0.5 0.6 0.7 0.8 data.t$tk[[k]] log, serBilir 3.7 3.8 3.9 4.0 4.1 data.t$tk[[k]] albumin 0.45 0.50 0.55 0.60 0.65 data.t$tk[[k]] ^, (0.1 * prothrombin), −4 0 2 4 6 8 10 12 14 xpts ypts 0.0 0.2 0.4 0.6 0.8 1.0 Time Event−free probability GL. Hickey Joint modelling of multivariate data 22 / 28

code Simulated scheme + constrained B-spline smoother 0.2 0.3 0.4 0.5 0.6 0.7 0.8 data.t$tk[[k]] log, serBilir 3.7 3.8 3.9 4.0 4.1 data.t$tk[[k]] albumin 0.45 0.50 0.55 0.60 0.65 data.t$tk[[k]] ^, (0.1 * prothrombin), −4 0 2 4 6 8 10 12 14 xpts ypts 0.0 0.2 0.4 0.6 0.8 1.0 Time Event−free probability GL. Hickey Joint modelling of multivariate data 22 / 28

Introduction Model Estimation Software & Example revisited Prediction References Truly
dynamic prediction GL. Hickey Joint modelling of multivariate data 23 / 28 0.20 0.25 0.30 0.35 0.40 0.45 data.t$tk[[k]] log, serBilir 2.5 3.0 3.5 4.0 4.5 5.0 5.5 data.t$tk[[k]] albumin 0.3 0.4 0.5 0.6 data.t$tk[[k]] ^, (0.1 * prothrombin), −4 0 0 2 4 6 8 10 12 14 xpts ypts 0.0 0.2 0.4 0.6 0.8 1.0 Time Event−free probability

Introduction Model Estimation Software & Example revisited Prediction References Other
software for dynamic prediction R (JM): univariate joint models (spline model for λ0(t)) 0 2 4 6 8 10 12 14 0.0 0.2 0.4 0.6 0.8 1.0 Univariate joint model {B} Time Pr(Ti ≥ u | Ti > t, y ~ i (t)) R package JM joineRML 0 2 4 6 8 10 12 14 0.0 0.2 0.4 0.6 0.8 1.0 Univariate joint model {A} Time Pr(Ti ≥ u | Ti > t, y ~ i (t)) R package JM joineRML 0 2 4 6 8 10 12 14 0.0 0.2 0.4 0.6 0.8 1.0 Univariate joint model {P} Time Pr(Ti ≥ u | Ti > t, y ~ i (t)) R package JM joineRML R (JMbayes): uni- and multi-variate joint models Stata (stjm): univariate joint models GL. Hickey Joint modelling of multivariate data 24 / 28

Introduction Model Estimation Software & Example revisited Prediction References Value
of multiple biomarkers Build all 32 − 1 = 7 joint models: i.e. from trivariate ({B, A, P}) to univariate ({B}, {A}, {P}) For all patients still in the risk set at landmark time (t), predict their failure probability at horizon time (s), i.e. P[t < T∗ i ≤ s + t | T∗ i > t, yi {t}; θ] Calculate concordance statistic1 (Therneau and Watson 2015) between prediction and whether subject had event in (t, s + t] 1Using the survConcordance() function in the R survival package GL. Hickey Joint modelling of multivariate data 25 / 28

Introduction Model Estimation Software & Example revisited Prediction References Ensemble
of joint models (patient #11) GL. Hickey Joint modelling of multivariate data 26 / 28 0 2 4 6 8 10 12 14 0.0 0.2 0.4 0.6 0.8 1.0 Visit 1 (t = 0) Time (years) Predicted survival {B, A, P} {B, A} {B, P} {A, P} {B} {A} {P}

Introduction Model Estimation Software & Example revisited Prediction References Discrimination
(concordance) q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q 0 2 4 1 2 4 8 Landmark time (t; years) Horizon time (s; years) 0.6 0.7 0.8 0.9 1.0 0.6 0.7 0.8 0.9 1.0 0.6 0.7 0.8 0.9 1.0 0.6 0.7 0.8 0.9 1.0 Predicted survival Model q q q q q q q {B, A, P} {B, A} {B, P} {A, P} {B} {A} {P} GL. Hickey Joint modelling of multivariate data 27 / 28

Introduction Model Estimation Software & Example revisited Prediction References References
Questions? Murtaugh, Paul A et al. (1994). Primary biliary cirrhosis: prediction of short-term survival based on repeated patient visits. Hepatology 20(1), pp. 126–134. Henderson, R et al. (2000). Joint modelling of longitudinal measurements and event time data. Biostatistics 1(4), pp. 465–480. Rizopoulos, Dimitris (2011). Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data. Biometrics 67(3), pp. 819–829. Therneau, Terry M and Watson, David A (2015). The concordance statistic and the Cox model. Tech. rep. Rochester, Minnesota: Department of Health Science Research, Mayo Clinic, Technical Report Series No. 85. GL. Hickey Joint modelling of multivariate data 28 / 28

Dynamic survival prediction for multivariate jo...

Dynamic survival prediction for multivariate joint models using the R package joineRML

Graeme Hickey

More Decks by Graeme Hickey

Other Decks in Research

Featured

Transcript

Introduction Model Estimation Software & Example revisited Prediction References Dynamic

Introduction Model Estimation Software & Example revisited Prediction References Preamble

Introduction Model Estimation Software & Example revisited Prediction References Joint

Introduction Model Estimation Software & Example revisited Prediction References Joint

Introduction Model Estimation Software & Example revisited Prediction References Joint

Introduction Model Estimation Software & Example revisited Prediction References Joint

Introduction Model Estimation Software & Example revisited Prediction References Joint

Introduction Model Estimation Software & Example revisited Prediction References Clinical

Introduction Model Estimation Software & Example revisited Prediction References Prothrombin

Introduction Model Estimation Software & Example revisited Prediction References 0

Introduction Model Estimation Software & Example revisited Prediction References Data

Introduction Model Estimation Software & Example revisited Prediction References Longitudinal

Introduction Model Estimation Software & Example revisited Prediction References Time-to-event

Introduction Model Estimation Software & Example revisited Prediction References Likelihood

Introduction Model Estimation Software & Example revisited Prediction References Estimation

Introduction Model Estimation Software & Example revisited Prediction References joineRML

Introduction Model Estimation Software & Example revisited Prediction References Proposed

Introduction Model Estimation Software & Example revisited Prediction References Example

Introduction Model Estimation Software & Example revisited Prediction References Results

Introduction Model Estimation Software & Example revisited Prediction References Results

Introduction Model Estimation Software & Example revisited Prediction References Dynamic

Introduction Model Estimation Software & Example revisited Prediction References Dynamic

Introduction Model Estimation Software & Example revisited Prediction References Dynamic

Introduction Model Estimation Software & Example revisited Prediction References Example

Introduction Model Estimation Software & Example revisited Prediction References Example

Introduction Model Estimation Software & Example revisited Prediction References Example

Introduction Model Estimation Software & Example revisited Prediction References Example

Introduction Model Estimation Software & Example revisited Prediction References Truly

Introduction Model Estimation Software & Example revisited Prediction References Other

Introduction Model Estimation Software & Example revisited Prediction References Value

Introduction Model Estimation Software & Example revisited Prediction References Ensemble

Introduction Model Estimation Software & Example revisited Prediction References Discrimination

Introduction Model Estimation Software & Example revisited Prediction References References