Slide 1

Slide 1 text

Multiple Imputation of Non-Proportional Hazards Survival Data Anna France Lancaster University July 4, 2017 Anna France Multiple Imputation July 4, 2017 1 / 27

Slide 2

Slide 2 text

Introduction Motivation: Examine factors affecting survival post-stroke, accounting for missing data appropriately. Aims: Develop a multiple imputation framework, building upon the Multiple Imputation using Chained Equations (MICE) approach. Adjust the multiple imputation framework to account for non-proportional hazards survival data. Anna France Multiple Imputation July 4, 2017 2 / 27

Slide 3

Slide 3 text

Overview of Data The data includes 538 patients, identified as having had a stroke and entered onto the stroke register at University Hospital Aintree or the Royal Liverpool University Hospitals, Broadgreen, between January 1st and June 11th 1996. The data set comprises of baseline measurements from the time of stroke and several outcome measures assessed at discharge and during follow-up at 3, 6, 12, 24, 36, 48 and 60 months post-stroke. Baseline measures: Patient characteristics; e.g. age, gender, smoking status, pre-stroke mobility, previous medical conditions, Stroke event assessments; e.g. admission date, worst conscious level, side of lesion, blood pressure, CT scan results. Outcome measures: Date and cause of death, Activities (Barthel Index), participation (Rankin) and mood (Yale), Resource use; e.g. length of hospital stay, medication, re-admissions. Anna France Multiple Imputation July 4, 2017 3 / 27

Slide 4

Slide 4 text

Missing Data Missing data is common issue in observational data. Missing data causes two key problems; bias and loss of efficiency. Missing data has the potential to undermine the validity of research results. Risk of bias, due to missing data, depends on the reasons why the data is missing. Anna France Multiple Imputation July 4, 2017 4 / 27

Slide 5

Slide 5 text

Missing Data Mechanisms Let Mi be the indicator of response, Xi denote the data, partitioned into Xo i and Xm i for observed and missing respectively, and let Zi denote a matrix of covariates associated with Xi . Reasons for missing data can be classified by the following mechanisms: Missing Completely at Random (MCAR); Pr(Mi |Xi , Zi ) = Pr(Mi ) Missing at Random (MAR); Pr(Mi |Xi , Zi ) = Pr(Mi |Xo i , Zi ) Not Missing at Random (NMAR); Pr(Mi |Xi , Zi ) = Pr(Mi |Xo i , Xm i , Zi ) Anna France Multiple Imputation July 4, 2017 5 / 27

Slide 6

Slide 6 text

Multiple Imputation Multiple imputation (MI) is a relatively flexible, general approach to missing data. MI involves creating multiple copies of the data set, replacing missing values with imputed values. The imputed values are sampled from their predictive distribution, given the observed data. The imputation procedure must fully account for all uncertainty in the missing values through injecting appropriate variability into the imputed values (Sterne et al., 2009). Anna France Multiple Imputation July 4, 2017 6 / 27

Slide 7

Slide 7 text

Multiple Imputation using Chained Equations (MICE) Suppose we have an outcome of interest Y , partially observed variables X1, X2, ..., Xp, and fully observed covariates Z. Specify a substantive model for f (Y |X1, ..., Xp, α), with parameters α. Assume throughout that the substantive model is correctly specified and the missing at random assumption holds for the missing values. Specify univariate models for each partially observed variable, conditional on all the other variables: f (Xj |X−j , Z, Y , θj ), j = 1, ..., p. Missing values are imputed in Xj , conditional on observed values and most recent imputation of X−j and Z, Y . Cycle through each of the partially observed variables, imputing from each univariate model, in a Gibbs sampling approach. Each univariate model can be of a different type, so MICE is appealing for data sets with mixtures of continuous and categorical variables (Bartlett et al., 2015). Anna France Multiple Imputation July 4, 2017 7 / 27

Slide 8

Slide 8 text

MICE Algorithm For imputation k = 1, ..., K: 1 Impute missing values of X initially using an ad-hoc approach. 2 For iteration n = 1, ..., N; 1 Impute from f (X1|X−1, Z, Y , θ1 ); Fit model f (X1|X−1, Z, Y , θ1 ) using subjects for whom X1 was observed. Draw θ(n) 1 from posterior for θ1 corresponding to this fit. Impute missing values in X1 once from f (X1|X−1, Z, Y , θ(n) 1 ). 2 Impute from f (X2|X−1, Z, Y , θ2 ) 3 ... 4 Impute from f (Xp|X−1, Z, Y , θp ) 3 Repeat 2, cycling over iterations 4 Use current imputed values of missing values to form k-th imputed dataset. Anna France Multiple Imputation July 4, 2017 8 / 27

Slide 9

Slide 9 text

Analysis of Multiply Imputed Data Standard statistical methods can be used to fit the substantive model of interest to each of the imputed datasets. The estimated associations will differ between each of the imputed datasets due to the variation introduced by the MI procedure. The estimated associations and standard errors can be combined using Rubin’s rule to obtain useful and valid inferences. Anna France Multiple Imputation July 4, 2017 9 / 27

Slide 10

Slide 10 text

Rubin’s Rules Let β denote the vector of parameters in the substantive model. Suppose ˆ βk is an estimate of a univariate or multivariate quantity of interest obtained from the k-th imputed data set j and Wk is the estimated variance of ˆ βk. The combined estimate ˆ β is the average of the individual estimates, algebraically: ˆ β = 1 K K k=1 ˆ βj . (1) The variance estimator, var(ˆ β), is the total variance of ˆ β, and is formed from the within-imputation variance W = (1/K) K k=1 Wk and the between-imputation variance B = (1/(K − 1)) K k=1 (ˆ βk − ˆ β)2. Rubin (1976) defines the variance estimator as var(ˆ β) = W + 1 + 1 K B. (2) Anna France Multiple Imputation July 4, 2017 10 / 27

Slide 11

Slide 11 text

Building the Imputation Models Including the outcome: The outcome of interest must be included, if not, it could falsely weaken the association between the variables and the outcome. For survival data, the outcome is Y = (T, D) It is essential to include the appropriate function of the survival outcome; White and Royston (2009) recommend including the censoring indicator D and the Nelson-Aalen estimator, H(T), of the cumulative hazard function H0(T). White et al. (2011) also recommend that the following must be included in the imputation models: Predictors for the missingness, Mi , of Xj . All the covariates to be included in the substantive model, including the outcome. Predictors for the observed values of Xj . Anna France Multiple Imputation July 4, 2017 11 / 27

Slide 12

Slide 12 text

Results Variable β-coefficient HR SE(β) p-value Age 0.0267 1.0271 0.0059 <0.0001 Centre -0.2496 0.7791 0.1601 0.1190 Pre-stroke Mobility (200m Outdoors) Indoors 0.3800 1.4623 0.1353 0.0050 Needs Help 0.4637 1.5899 0.1977 0.0190 Diabetes Mellitus 0.4120 1.5098 0.1630 0.0115 Side of Lesion (No Lesion) Right -0.2026 0.8166 0.1532 0.1860 Left 0.0639 1.0660 0.1588 0.6876 Both 0.8483 2.3357 0.2982 0.0044 Systolic BP -0.0477 0.9534 0.0136 0.0004 Quadratic Systolic BP 0.0001 1.0001 <0.0001 0.0015 Class/Conscious Level (Alert & LACS) Alert & PACS 0.0942 1.0988 0.1691 0.5774 Alert & POCS -0.4667 0.6271 0.4079 0.2526 Alert & TACS 0.2857 1.3307 0.2212 0.1966 Alert & Unclassified 0.4692 1.5987 0.2825 0.0967 Drowsy & LACS 0.6852 1.9842 0.3483 0.0491 Drowsy & PACS 0.5791 1.7844 0.2769 0.0365 Drowsy & POCS 0.3261 1.3856 0.6504 0.6161 Drowsy & TACS 0.3653 1.4409 0.5476 0.5047 Unconscious 1.2219 3.3936 0.1756 <0.0001 CT Scan: Lesion Type (No Lesion) CI 0.1778 1.1946 0.1639 0.2781 HCI 0.7517 2.1206 0.3249 0.0207 PICH 0.5886 1.8015 0.2672 0.0276 No Scan @ Centre A 1.1154 3.0508 0.2304 <0.0001 No Scan @ Centre B 0.7854 2.1933 0.1998 0.0001 Anna France Multiple Imputation July 4, 2017 12 / 27

Slide 13

Slide 13 text

Schoenfeld Residuals Anna France Multiple Imputation July 4, 2017 13 / 27

Slide 14

Slide 14 text

How to Handle Non-Proportional Hazards For the substantive model, we intend to fit a piecewise-proportional hazards model constructed through interactions with a time dependent covariate. To understand how the imputation models need adapting to accommodate the interactions with a time dependent covariate, we must first examine how the imputation models were constructed initially. Each variable type requires a different imputation model; we will focus on binary X and a logistic regression model. Anna France Multiple Imputation July 4, 2017 14 / 27

Slide 15

Slide 15 text

Cox PH Model The Cox Proportional Hazards model is defined as; h(t|X, Z) = h0(t) exp(βX X + βZ Z). Given the complete data, the log-likelihood for the outcomes is; log p(T, D|X, Z) = D log h(T|X, Z) − H(T|X, Z) = D(log h0(T) + βX X + βZ Z) − H0(T) exp(βX X + βZ Z). By Bayes’ theorem, the conditional distribution of X given the observed data is; log p(X|T, D, Z) = log p(X|Z) + log p(T, D|X, Z) − log p(T, D|Z) = log p(X|Z) + D(βX X + βZ Z) − H0(T) exp(βX X + βZ Z) + c where the constant, c, may depend on D, T and Z, but not X. Anna France Multiple Imputation July 4, 2017 15 / 27

Slide 16

Slide 16 text

Binary X Using the exposure model, logit p(X = 1|Z) = ζZ , we have logit p(X = 1|T, D, Z) = log p(X = 1|T, D, Z) − log p(X = 0|T, D, Z) = log p(X = 1|Z) + D(βX + βZ Z) − H0(T)eβX +βZ Z − [log p(X = 0|Z) + D(βZ Z) − H0(T)eβZ Z ] = log p(X = 1|Z) − log p(X = 0|Z) + DβX − H0(T)eβX +βZ Z + H0(T)eβZ Z = ζZ + DβX − H0(T)(eβX − 1)eβZ Z Note: This is not a standard logistic regression model due to the exp(βZ Z) term on the right hand side. Anna France Multiple Imputation July 4, 2017 16 / 27

Slide 17

Slide 17 text

Binary X For Binary X, we have the imputation model; logit p(X = 1|T, D, Z) = ζZ + DβX − H0(T)(eβX − 1)eβZ Z . (3) Taking the simplest case, if there is no Z, then we have logit p(X = 1|T, D) = ζ + DβX − H0(T)(eβX − 1). This is a logistic regression on D and H0(T), hence we have the logistic regression model; logit p(X = 1|T, D) = α0 + α1D + α2H0(T) (4) Anna France Multiple Imputation July 4, 2017 17 / 27

Slide 18

Slide 18 text

Binary X, General Z There are no exact results for the general case. If we assume the exposure model logit p(X = 1|Z) = ζ0 + ζ1Z, and for small var(βZ Z), use the approximation exp(βZ Z) ≈ exp(βZ ¯ Z) in equation (3), then we get logit p(X = 1|T, D, Z) ≈ ζ0 + ζ1Z + DβX − H0(T)(eβX − 1)eβZ ¯ Z , (5) giving a logistic regression on D, H0(T) and Z: logit p(X = 1|T, D, Z) = α0 + α1D + α2H0(T) + α3Z. (6) A more accurate approximation, exp(βZ Z) ≈ exp(βZ ¯ Z{1 + βZ (Z − ¯ Z)}, gives logit p(X = 1|T, D, Z) ≈ ζ0 + ζ1Z + DβX − H0(T)eβZ ¯ Z (eβX − 1){1 + βZ (Z − ¯ Z)}, a logistic regression on D, H0(T), Z and the interaction H0(T) × Z. Anna France Multiple Imputation July 4, 2017 18 / 27

Slide 19

Slide 19 text

Piecewise Cox PH Model Let X have a time-dependent covariate effect, such that the Cox PH model is defined as: h(t|X) = h0(t) exp(βX1X + βX2I(t > t0)X + βZ Z), where I(t > t0) = 1, if t > t0 0, if t ≤ t0 We also now have two censoring indicators: D1 = D, if T ≤ T0 0, otherwise D2 = D, if T > T0 0, otherwise Anna France Multiple Imputation July 4, 2017 19 / 27

Slide 20

Slide 20 text

Conditional Distribution of X The log-likelihood for the outcomes is then log p(T, D1, D2|X, Z) = D1 (log h0 (T) + βX1 X + βZ Z) + D2 (log h0 (T) + βX1 X + βX2 X + βZ Z) − H0 (min(T0, T))eβX1X+βZ Z + (H0 (T) − H0 (min(T0, T)))eβX1X+βX2X+βZ Z . By Bayes’ theorem, the conditional distribution of X given the observed covariates is; log p(X|T, D1, D2, Z) = log p(X|Z) + D1 (βX1 X + βZ Z) + D2 (βX1 X + βX2 X + βZ Z) − H0 (min(T0, T))eβX1X+βZ Z + (H0 (T) − H0 (min(T0, T)))eβX1X+βX2X+βZ Z + c where the constant, c, may depend on D1 , D2 , T and Z, but not X. Anna France Multiple Imputation July 4, 2017 20 / 27

Slide 21

Slide 21 text

Binary X Using the most general exposure model, logit p(X = 1|Z) = ζZ , we have logit p(X = 1|T, D1, D2, Z) = log p(X = 1|T, D1, D2, Z) − log p(X = 0|T, D1, D2, Z) = ζZ + D1βX1 + D2 (βX1 + βX2 ) − H0 (min(T0, T))(eβX1 − 1) + (H0 (T) − H0 (min(T0, T)))(eβX1+βX2 − 1) eβZ Z Note: This is not a standard logistic regression model due to the exp(βZ Z) term on the right hand side. Anna France Multiple Imputation July 4, 2017 21 / 27

Slide 22

Slide 22 text

Binary X, no Z For Binary X, we have the imputation model; logit p(X = 1|T, D1, D2, Z) = ζZ + D1βX1 + D2 (βX1 + βX2 ) − H0 (min(T0, T))(eβX1 − 1) + (H0 (T) − H0 (min(T0, T)))(eβX1+βX2 − 1) eβZ Z . Considering the simplest case, if there is no Z, then we have logit p(X = 1|T, D1, D2 ) = ζ + D1βX1 + D2 (βX1 + βX2 ) − H0 (min(T0, T))(eβX1 − 1) + (H0 (T) − H0 (min(T0, T)))(eβX1+βX2 − 1) . This is a logistic regression on D1, D2, H0(min(T0, T)) and H0(T) − H0(min(T0, T)), hence we have the logistic regression model; logit p(X = 1|T, D1, D2 ) = α0 + α1 D1 + α2 D2 + α3 H0 (min(T0, T)) + α4 (H0 (T) − H0 (min(T0, T))) Anna France Multiple Imputation July 4, 2017 22 / 27

Slide 23

Slide 23 text

Binary X, General Z As before, for general Z, we have no exact results. Using the exposure model, logit p(X = 1|Z) = ζ0 + ζ1 Z, and the approximation exp(βZ Z) ≈ exp(βZ ¯ Z), for small var(βZ Z), we get; logit p(X = 1|T, D1, D2, Z) = ζ + D1βX1 + D2 (βX1 + βX2 ) − H0 (min(T0, T))(eβX1 − 1) + (H0 (T) − H0 (min(T0, T)))(eβX1+βX2 − 1) exp(βZ ¯ Z). This is a logistic regression on Z, D1 , D2 , H0 (min(T0, T)) and H0 (T) − H0 (min(T0, T)), giving the imputation model; logit p(X = 1|T, D1, D2, Z) = α0 + α1 D1 + α2 D2 + α3 H0 (min(T0, T)) + α4 (H0 (T) − H0 (min(T0, T))) + α5 Z Any transformation of Z needed for predicting X should also be included in the imputation model. Anna France Multiple Imputation July 4, 2017 23 / 27

Slide 24

Slide 24 text

Results Variable β-Estimate HR SE(β) p-value Age Age (ST ≤ 30 days) -0.0017 0.9983 0.0077 0.8253 Age (ST >30 days) 0.0599 1.0617 0.0117 <0.0001 Centre Centre A (Baseline) Centre B -0.2783 0.7571 0.1617 0.0853 Side of Lesion No Lesion (Baseline) Right -0.0868 0.9169 0.1565 0.5791 Left 0.1601 1.1736 0.1608 0.3193 Both 0.9543 2.5969 0.3217 0.003 Systolic BP -0.0497 0.9515 0.0139 0.0003 Quadratic Systolic BP 0.0001 1.0001 <0.0001 0.0021 Pre-stroke Mobility 200m Outdoors (baseline) Indoors (ST ≤ 30 days) 0.2693 1.3090 0.2028 0.1842 Indoors (ST >30 days) 0.3865 1.4718 0.2614 0.6539 Needs Help (ST ≤ 30 days) 0.0513 1.0526 0.2603 0.8438 Needs Help (ST >30 days) 1.1886 3.2825 0.3554 0.0014 Anna France Multiple Imputation July 4, 2017 24 / 27

Slide 25

Slide 25 text

Results (continued) Variable β-Estimate HR SE(β) p-value CT Scan: Lesion Type No Lesion (baseline) CI (ST ≤ 30 days) 0.0277 1.0281 0.3184 0.9308 CI (ST >30 days) 0.2167 1.2420 0.3706 0.61 HCI (ST ≤ 30 days) 0.804 2.2345 0.5741 0.1614 HCI (ST >30 days) 0.8527 2.3460 0.6757 0.9425 PICH (ST ≤ 30 days) 1.0872 2.9660 0.3575 0.0024 PICH (ST >30 days) -0.0377 0.9630 0.4725 0.0173 No scan @ Centre A (ST ≤ 30 days) 1.8037 6.0721 0.3321 <0.0001 No scan @ Centre A (ST >30 days) 0.0998 1.1050 0.6183 0.0059 No scan @ Centre B (ST ≤ 30 days) 1.3553 3.8779 0.3198 <0.0001 No scan @ Centre B (ST >30 days) 0.2842 1.3287 0.3869 0.0056 Class/Conscious Level Alert & LACS (Baseline) Alert & PACS 0.1725 1.1883 0.168 0.3047 Alert & POCS -0.4766 0.6209 0.4043 0.2385 Alert & TACS 0.3614 1.4353 0.2219 0.1034 Alert & Unclassified 0.5918 1.8072 0.2784 0.0335 Drowsy & LACS 0.8133 2.2553 0.335 0.0152 Drowsy & PACS 0.6201 1.8591 0.2684 0.0209 Drowsy & POCS 0.3976 1.4882 0.5852 0.4969 Drowsy & TACS 0.5162 1.6756 0.5224 0.3232 Unconscious 1.1927 3.2960 0.1738 <0.0001 Anna France Multiple Imputation July 4, 2017 25 / 27

Slide 26

Slide 26 text

Further Work Model validation Discuss findings with clinicians Extensions to other forms of handling non-proportional hazards Diagnostics for MI in a Cox framework Anna France Multiple Imputation July 4, 2017 26 / 27

Slide 27

Slide 27 text

References Bartlett, J. W., Seaman, S. R., White, I. R., and Carpenter, J. R. (2015). Multiple imputation of covariates by fully conditional specification: accommodating the substantive model. Statistical methods in medical research, 24(4):462–487. Rubin, D. B. (1976). Inference and missing data. Biometrika, 63(3):581–592. Sterne, J. A., White, I. R., Carlin, J. B., Spratt, M., Royston, P., Kenward, M. G., Wood, A. M., and Carpenter, J. R. (2009). Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. Bmj, 338:b2393. White, I. R. and Royston, P. (2009). Imputing missing covariate values for the cox model. Stat Med, 28(15):1982–1998. White, I. R., Royston, P., and Wood, A. M. (2011). Multiple imputation using chained equations: Issues and guidance for practice. Statistics in Medicine, 30(1):377–399. Anna France Multiple Imputation July 4, 2017 27 / 27