Richard Emsley

SAM Conference 2017

July 04, 2017

Transcript

  1. New approaches to longitudinal causal mediation analysis Professor Richard Emsley

    Centre for Biostatistics, School of Health Sciences, The University of Manchester, Manchester Academic Health Science Centre Research and Methodology Director, MAHSC Clinical Trials Unit MRC North West Hub for Trials Methodology Research Statistical Analysis of Multi-Outcome Data, Liverpool Tuesday 4th July 2017
  2. Research Programme: Efficacy and Mechanisms Evaluation Joint work with Graham

    Dunn, Ian White, Andrew Pickles and Sabine Landau. Funded by Medical Research Council Methodology Research Programmes: • Design and methods of explanatory (causal) analysis for randomised trials of complex interventions in mental health (2006-2009)  Graham Dunn (PI), Richard Emsley, et al • Estimation of causal effects of complex interventions in longitudinal studies with intermediate variables (2009-2012)  Richard Emsley (PI), Graham Dunn. • MRC Early Career Centenary Award (2012-13) • Designs and analysis for the evaluation and validation of social and psychological markers in randomised trials of complex interventions in mental health (2010-12)  Graham Dunn (PI), Richard Emsley, et al. • Developing methods for understanding mechanism in complex interventions (2013-16)  Sabine Landau (PI), Richard Emsley, et al. • MRC NorthWest Hub for Trials Methodology Research (2013-2018)  Paula Williamson (PI), Richard Emsley, et al.
  3. 1. Does it work?  Efficacy analysis 2. How does

    it work?  Mediation analysis 3. Who does it work for?  Stratified/personalised medicine 4. What factors make it work better?  Process evaluation EME: four key questions about treatments
  4. • Dunn G, Emsley RA, Liu H, Landau S, Green

    J, White I and Pickles A. (2015). Evaluation and validation of social and psychological markers in randomised trials of complex interventions in mental health. Health Technology Assessment 19 (93). • Non-technical introduction and summary of our work on analysing complex interventions:  Introduction to CI  Mediation analysis  Process evaluation  Longitudinal extensions  Stratified medicine  Guidance and tips for trialists Methodology report
  5. Motivation: the mediation industry • Baron RM & Kenny DA

    (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology 51, 1173-1182. • As of 3rd July 2017: 30,805 citations and hundreds more each month! • Depends on the implicitly-assumed absence of hidden confounding which is very rarely stated, let alone its validity discussed.  One suspects that the majority of investigators are oblivious of the assumptions and their implications. • “One is left with the unsettling thought that the thousands of investigations of mediational mechanisms in the psychological and other literatures are of unknown and questionable value” (Emsley et al, 2010).
  6. • Baron and Kenny (1986) defined mediation as the “generative

    mechanisms through which the focal independent variable is able to influence the dependent variable of interest” • A mediator (M) is a variable that occurs in the causal pathway from an exposure (R) to an outcome variable (Y). It causes variation in the outcome and itself is caused to vary by the exposure variable.  This causal chain implies a temporal relation • R occurs before M and • M occurs before Y • Mediating variables are often called intervening or intermediate variables. Mediation and mediators
  7. • To reflect the mediated effect we need a path

    from R to M and a further path from M to Y. • The following diagram illustrates complete mediation by M. • Note that the diagram implies that M is the only mechanism by which R can change Y. Complete mediation [Diagram: R → M → Y]
  8. • We might not want to rule out effects of

    R on Y other than those operating by changing M. • The following triangle illustrates partial mediation by M. “The mediation triangle” [Diagram: R → M → Y, plus a direct path R → Y]
  9. • Clinical psychology and psychiatry:  Aetiology: Negative life events

    increase hopelessness and hopelessness leads to depression.  Treatment: Cognitive behavioural therapy improves cognition which in turn improves functioning. • Prevention research:  Public health interventions are typically based on a mediation theory; • E.g. better school meals to improve nutrition to improve health. • E.g. programmes to prevent relapse from addiction aimed at preventing further health problems. Mediation examples
  10. • Mediation investigations aim to partition total (causal) treatment effects

    into 1. effects that operate via changing the putative mediator – so called indirect treatment effects 2. and non-mediated effects – so-called direct treatment effects. • Note that direct effects include effects via any mediating variable not included in the model.  So the meaning of a direct effect is always relative to the variable whose mediating effect is being modelled. Direct and indirect effects
  11. Mediation analysis and causal inference… “Mediation analysis is a form

    of causal analysis…all too often persons conducting mediational analysis either do not realize that they are conducting causal analyses or they fail to justify the assumptions that they have made in their causal model.” David Kenny (2008), Reflections on Mediation, Organizational Research Methods.
  12. • Two main schools in the literature for the analysis of

    mediation:  Statistical Mediation Analysis • Social sciences / psychometrics (Baron & Kenny, 1986; MacKinnon, 2008)  Causal Mediation Analysis • Causal inference (Robins and Greenland, 1992; Pearl, 2001; VanderWeele, 2015) • The first is more accessible and widely used, but also misused. • The second is more rigorous and more general, but more complex. Analysing mediation
  13. • Statistical mediation analysis  Builds on Judd & Kenny

    (1981) and Baron & Kenny (1986)  Structural Equation Models  Monograph by David MacKinnon (2008).  Work by Kris Preacher and Andrew Hayes. Statistical mediation analysis
  14. Traditional regression approach • It is based on two regression

    models:  Model for mediator (M): , = + +  Model for outcome (Y): , , = + + +  (X are baseline covariates that act as observed confounders.) • The indirect effect is given by  • In the absence of a R x M interaction the direct effect is given by  • Note that the total treatment effect is +
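  15. • $\theta_1$ is the direct effect of R on Y (not

    through M) • $\beta_1 \times \theta_2$ is the indirect effect (through M) Linear structural equation models $E[M \mid R, X] = \beta_0 + \beta_1 R + \beta_2 X$, $E[Y \mid R, M, X] = \theta_0 + \theta_1 R + \theta_2 M + \theta_4 X$ [Path diagram with nodes R, M, Y and X]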
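The decomposition of the total effect can be checked by substituting the mediator model into the outcome model. The short derivation below is a sketch under assumed labels: $\beta$ for the mediator-model coefficients and $\theta$ for the outcome-model coefficients, with no R x M interaction and both models linear and correctly specified.

```latex
\begin{aligned}
E[M \mid R, X] &= \beta_0 + \beta_1 R + \beta_2 X \\
E[Y \mid R, M, X] &= \theta_0 + \theta_1 R + \theta_2 M + \theta_4 X \\
E[Y \mid R, X] &= \theta_0 + \theta_1 R + \theta_2 (\beta_0 + \beta_1 R + \beta_2 X) + \theta_4 X \\
               &= (\theta_0 + \theta_2 \beta_0) + (\theta_1 + \beta_1 \theta_2) R + (\theta_4 + \theta_2 \beta_2) X
\end{aligned}
```

The coefficient of R in the last line is the total effect: the direct part $\theta_1$ plus the product $\beta_1 \theta_2$ of the two paths through M.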
  16. Baron and Kenny steps • Stepwise procedure for demonstrating mediation:

    1. Demonstrate that treatment has a total effect on outcome  Regress outcome on treatment and covariates 2. Demonstrate that treatment has an effect on the mediator  Regress mediator on treatment and covariates 3. Demonstrate that the mediator has an effect on outcome, after controlling for treatment.  Regress outcome on treatment AND mediator and covariates; check effect of mediator on outcome is significant and estimate of treatment on outcome reduced in magnitude relative to the estimate obtained in step 1.
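The three regressions can be sketched in a few lines of Python. This is a minimal illustration on simulated data, not the procedure used in any of the cited software: the function names `ols` and `baron_kenny`, the simulated coefficients and the sample size are all assumptions, and significance tests are left out to keep the sketch to NumPy only.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def baron_kenny(r, m, y, x):
    """Run the three Baron and Kenny regressions and collect the key slopes."""
    total = ols(y, r, x)[1]        # step 1: outcome on treatment and covariate
    a = ols(m, r, x)[1]            # step 2: mediator on treatment and covariate
    fit = ols(y, r, m, x)          # step 3: outcome on treatment, mediator, covariate
    direct, b = fit[1], fit[2]
    return {"total": total, "a": a, "b": b,
            "direct": direct, "indirect": a * b}

# Simulated trial (all values hypothetical): randomised R, one baseline covariate X.
rng = np.random.default_rng(2017)
n = 2000
x = rng.normal(size=n)
r = rng.integers(0, 2, size=n).astype(float)
m = 0.5 + 0.8 * r + 0.3 * x + rng.normal(size=n)
y = 1.0 + 0.4 * r + 0.6 * m + 0.2 * x + rng.normal(size=n)

steps = baron_kenny(r, m, y, x)
print(steps)
```

With ordinary least squares and the same covariates in every model, the step 1 estimate equals the step 3 treatment estimate plus `a * b`, so the reduction in magnitude checked in step 3 is exactly the estimated indirect effect.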
  17. [Diagram with nodes: Random allocation (R), Surrogate Outcome (S), True Outcome (T)] •

    A strong correlation between the biomarker and the clinical endpoint is not sufficient (or maybe even necessary) to conclude that a biomarker is a good surrogate • Prentice (Biometrics, 1989)  “A test of H0 of no effect of treatment on surrogate is equivalent to a test of H0 of no effect of treatment on true endpoint.” Aside: surrogate outcomes
  18. 1. Treatment R is prognostic for the surrogate S 

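    $\alpha$ is significant in model: $S = \mu_S + \alpha R + \varepsilon_S$ 2. Treatment R is prognostic for true endpoint T  $\beta$ is significant in model: $T = \mu_T + \beta R + \varepsilon_T$ 3. Surrogate S is prognostic for true endpoint T  $\gamma$ is significant in model: $T = \mu + \gamma S + \varepsilon$ 4. The full effect of the treatment R on the true endpoint T is explained by S: $T = \tilde{\mu}_T + \beta_S R + \gamma_R S + \tilde{\varepsilon}_T$, with $\beta_S = 0$ Aside: Prentice Criteria (1989)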
  19. Confounded mediation in trials

    [Path diagram: Random Allocation R, Mediator M and Outcome Y, with Covariates X, error terms, and U, the unmeasured confounders]
  20. • Possible confounding of the effect of mediator on outcome

    (the omitted variables problem). • Mediator measured with error. • Data structure: e.g. serial assessments of both mediator and clinical outcome, or serial assessments of the mediator and a survival time for the outcome. • Possibility of multiple mediators working in parallel. Challenges for establishing mediation
  21. • The Baron and Kenny (1986) procedure and subsequent estimation

    of the indirect effect can be appropriate if, besides linear relationships with Y, the following hold:  No measurement error in M  No interaction effect M x R on Y  No confounding by post-randomisation variables (all measured X are baseline variables).  No unmeasured confounding of the effect of M on Y • Using the bootstrap option is recommended for estimating the standard error of the indirect effect. • This applies for the use of structural equation modelling more generally too. Summary: Statistical mediation analysis
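The bootstrap recommendation can be illustrated with a nonparametric resampling sketch. Everything here is an assumption for illustration: the helper names `indirect_effect` and `bootstrap_indirect`, the simulated data, the 500 replications and the percentile interval. Packaged commands may use different defaults.

```python
import numpy as np

def indirect_effect(r, m, y, x):
    """Product-of-coefficients estimate: (R -> M slope) times (M -> Y slope)."""
    ones = np.ones(len(y))
    a = np.linalg.lstsq(np.column_stack([ones, r, x]), m, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, r, m, x]), y, rcond=None)[0][2]
    return a * b

def bootstrap_indirect(r, m, y, x, reps=500, seed=0):
    """Resample participants with replacement and re-estimate the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(reps)
    for k in range(reps):
        idx = rng.integers(0, n, size=n)   # one bootstrap sample of row indices
        draws[k] = indirect_effect(r[idx], m[idx], y[idx], x[idx])
    se = draws.std(ddof=1)
    lower, upper = np.percentile(draws, [2.5, 97.5])
    return se, (lower, upper)

# Simulated trial data (hypothetical coefficients).
rng = np.random.default_rng(11)
n = 1000
x = rng.normal(size=n)
r = rng.integers(0, 2, size=n).astype(float)
m = 0.5 + 0.8 * r + 0.3 * x + rng.normal(size=n)
y = 1.0 + 0.4 * r + 0.6 * m + 0.2 * x + rng.normal(size=n)

point = indirect_effect(r, m, y, x)
se, ci = bootstrap_indirect(r, m, y, x)
print(point, se, ci)
```

Resampling whole participants keeps the mediator and outcome of each person together, which is why the product of two estimated coefficients gets a usable standard error without a normality assumption for the product.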
  22. • Statistical mediation (B&K) has four main problems: 1. Unmeasured

    confounding between mediator and outcome 2. Assumes no interaction between exposure and mediator on outcome 3. Doesn’t easily extend to non-linear models 4. Assumes correctly specified models • Causal mediation analysis has arisen from the causal inference literature and addresses these problems. • Formally defines the causal mediation parameters. Causal mediation analysis
  23. • It is based on two regression models:  Model

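    for mediator (M): $E[M \mid R, X] = \beta_0 + \beta_1 R + \beta_2 X$  Model for outcome (Y): $E[Y \mid R, M, X] = \theta_0 + \theta_1 R + \theta_2 M + \theta_3 R \ast M + \theta_4 X$ • What is the direct effect?  Depends on the level of M ($\theta_1 + \theta_3 M$) • What is the indirect effect?  Depends on the level of R ($\beta_1 (\theta_2 + \theta_3 R)$) Traditional regression approach: problem with interaction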
  24. • Define outcome Y to be counterfactual in both the

    level of the randomisation R and the level of the mediator M; then for individual i:  Y(R=0, M=m) = Y(0, m): The outcome that would be observed if the person was randomised to control R=0 and the level of the mediator was fixed to value M=m.  Y(R=1, M=m) = Y(1, m): The outcome that would be observed if the person was randomised to treatment R=1 and the level of the mediator was fixed to value M=m. Causal effects
  25. • The individual treatment effect can be written as: ITE

    = Y(1) - Y(0) = Y(1,M(1)) - Y(0,M(0)) • We can define an individual’s controlled direct effect as the direct effect of treatment on outcome at mediator set to M=m, i.e. Y(1, m) - Y(0, m) with average value CDE = E[ Y(1,m) - Y(0,m) ]  In the first term, R is set to 1  In the second term, R is set to 0  In BOTH terms, M is set to m  This gives a direct effect, unmediated by M Controlled direct effect
  26. CDE with and without interaction • Controlled direct effect: Y(1,m)

    - Y(0,m)  Direct effect of randomisation on outcome at mediator level m. [Figure: axes m and CDE; lines labelled “No interaction” and “With interaction”]
  27. • But we are typically not interested in this; rather

    we might like to know the effect of treatment when the mediator takes its “natural” level under the control condition M(0). • This leads to the definition of the natural direct effect (NDE): $NDE = E[Y(1, M(0)) - Y(0, M(0))]$  In the first term, R is set to 1  In the second term, R is set to 0  In BOTH terms, M is set to M(0), the value if R=0 (unexposed)  Since M is the same within subject, this gives a direct effect, unmediated by M  If there is no individual-level interaction between R and M, CDE(m) = NDE ∀ m Natural direct effects
  28. • This leads to the definition of the natural indirect

    effect (NIE): $NDE = E[Y(1, M(0)) - Y(0, M(0))]$, $NIE = E[Y(1, M(1)) - Y(1, M(0))]$ • The NIE is the effect of change in the mediator on clinical outcome if randomised to treatment (R=1).  In the first term, M is set to M(1)  In the second term, M is set to M(0)  In both terms, R is set to 1  R is only allowed to influence Y through its influence on M Natural direct and indirect effects
  29. • These definitions allow us to partition the average treatment

    effects: Total effect = NDE + NIE • Also after conditioning on X=x: Total effect|X = NDE|X + NIE|X Natural direct, indirect and total effects
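The partition can be seen directly in a simulation where every counterfactual is available. The sketch below assumes a hypothetical linear data-generating model with an R x M interaction; the coefficient names and values are illustrative only. Because both potential mediator values and all potential outcomes are generated for every person, the effects are computed from their definitions rather than estimated.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

# Hypothetical structural coefficients (assumed for this sketch).
b0, b1 = 0.5, 0.8                     # mediator model: M(r) = b0 + b1*r + e_m
t0, t1, t2, t3 = 1.0, 0.4, 0.6, 0.3   # outcome model with R x M interaction

e_m = rng.normal(size=n)              # person-specific mediator error
e_y = rng.normal(size=n)              # person-specific outcome error

def m_of(r):
    """Potential mediator value M(r) for every person."""
    return b0 + b1 * r + e_m

def y_of(r, m):
    """Potential outcome Y(r, m) for every person."""
    return t0 + t1 * r + t2 * m + t3 * r * m + e_y

m0, m1 = m_of(0), m_of(1)

total = np.mean(y_of(1, m1) - y_of(0, m0))      # E[Y(1, M(1)) - Y(0, M(0))]
nde = np.mean(y_of(1, m0) - y_of(0, m0))        # E[Y(1, M(0)) - Y(0, M(0))]
nie = np.mean(y_of(1, m1) - y_of(1, m0))        # E[Y(1, M(1)) - Y(1, M(0))]
cde_at_0 = np.mean(y_of(1, 0.0) - y_of(0, 0.0)) # controlled direct effect at m = 0

print(total, nde, nie, cde_at_0)
```

The term Y(1, M(0)) appears with opposite signs in the NDE and the NIE, so the two always sum to the total effect. With the interaction present, the CDE at m = 0 differs from the NDE, which averages over the natural mediator level M(0).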
  30. • Wide range of options for most combinations of M

    and Y • Work on identification and estimation of direct and indirect causal effects using parametric regression models  VanderWeele and Vansteelandt (2009, 2010):  Outcomes can be continuous, binary, Poisson or negative binomial.  Mediators can be binary or continuous. • G-computation is flexible and efficient but requires parametric modelling assumptions:  correct specification of all relevant conditional expectations and distributions  gformula command in Stata (Daniel et al., 2011) • Semi-parametric methods make fewer parametric assumptions:  Inverse probability of treatment weighting (IPTW)  G-estimation Estimating causal parameters
  31. Assumptions for identification 1. There are no unmeasured exposure-outcome confounders

    given X 2. There are no unmeasured mediator-outcome confounders given X 3. There are no unmeasured exposure-mediator confounders given X 4. The mediator-outcome confounders are not affected by exposure. In counterfactual notation: 1. $Y(r, m) \perp\!\!\!\perp R \mid X$ 2. $Y(r, m) \perp\!\!\!\perp M \mid R, X$ 3. $M(r) \perp\!\!\!\perp R \mid X$ 4. $Y(r, m) \perp\!\!\!\perp M(r^{*}) \mid X$ • Assumptions 1 and 3 are satisfied by randomisation
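  32. • Two regression models

    $E[M \mid R = r, X = x] = \beta_0 + \beta_1 r + \beta_2 x$  $E[Y \mid R = r, M = m, X = x] = \theta_0 + \theta_1 r + \theta_2 m + \theta_3 r m + \theta_4 x$ • Using parameters from these models, we can obtain the causal parameters as: $CDE(m) = (\theta_1 + \theta_3 m)(r - r')$, $NDE = (\theta_1 + \theta_3 \beta_0 + \theta_3 \beta_1 r' + \theta_3 \beta_2 x)(r - r')$, $NIE = (\theta_2 \beta_1 + \theta_3 \beta_1 r)(r - r')$ • Without interactions ($\theta_3 = 0$), these become the parameters obtained from the traditional method multiplied by the difference in exposure levels, $r - r'$ Estimating causal parameters using parametric regression models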
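As a worked illustration, the closed-form expressions for a continuous mediator and continuous outcome can be coded directly. This is a sketch, not the `paramed` implementation: the function name `effects_continuous_mediator`, the argument layout and the example coefficients are assumptions. Here `beta` holds the mediator-model coefficients (intercept, exposure, covariate) and `theta` the outcome-model coefficients (intercept, exposure, mediator, interaction, covariate), for a single covariate value `x`.

```python
def effects_continuous_mediator(beta, theta, r, r_ref, m, x):
    """Plug regression coefficients into the CDE, NDE and NIE formulas.

    beta  = (b0, b1, b2): mediator model E[M | r, x] = b0 + b1*r + b2*x
    theta = (t0, t1, t2, t3, t4): outcome model
            E[Y | r, m, x] = t0 + t1*r + t2*m + t3*r*m + t4*x
    r, r_ref: exposure levels being compared; m: mediator level for the CDE.
    """
    b0, b1, b2 = beta
    _, t1, t2, t3, _ = theta
    delta = r - r_ref
    cde = (t1 + t3 * m) * delta
    nde = (t1 + t3 * (b0 + b1 * r_ref + b2 * x)) * delta
    nie = (t2 * b1 + t3 * b1 * r) * delta
    return {"cde": cde, "nde": nde, "nie": nie, "total": nde + nie}

# Hypothetical coefficients, first without and then with an interaction.
no_interaction = effects_continuous_mediator(
    beta=(0.5, 0.8, 0.3), theta=(1.0, 0.4, 0.6, 0.0, 0.2),
    r=1, r_ref=0, m=0.0, x=0.0)
with_interaction = effects_continuous_mediator(
    beta=(0.5, 0.8, 0.3), theta=(1.0, 0.4, 0.6, 0.3, 0.2),
    r=1, r_ref=0, m=0.0, x=0.0)

print(no_interaction)
print(with_interaction)
```

With the interaction coefficient set to zero, the NDE collapses to the exposure coefficient and the NIE to the product of the two path coefficients, which is the traditional result for a one-unit change in exposure.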
  33. Estimating causal parameters using parametric regression models • With a

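    binary mediator, the model for the mediator becomes: $\operatorname{logit} \Pr(M = 1 \mid R = r, X = x) = \beta_0 + \beta_1 r + \beta_2 x$, with outcome model $E[Y \mid R = r, M = m, X = x] = \theta_0 + \theta_1 r + \theta_2 m + \theta_3 r m + \theta_4 x$ • And the causal parameters are: $CDE(m) = (\theta_1 + \theta_3 m)(r - r')$, $NDE = \theta_1 (r - r') + \theta_3 (r - r') \dfrac{\exp(\beta_0 + \beta_1 r' + \beta_2 x)}{1 + \exp(\beta_0 + \beta_1 r' + \beta_2 x)}$, $NIE = (\theta_2 + \theta_3 r) \left[ \dfrac{\exp(\beta_0 + \beta_1 r + \beta_2 x)}{1 + \exp(\beta_0 + \beta_1 r + \beta_2 x)} - \dfrac{\exp(\beta_0 + \beta_1 r' + \beta_2 x)}{1 + \exp(\beta_0 + \beta_1 r' + \beta_2 x)} \right]$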
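For a binary mediator fitted by logistic regression and a continuous outcome, the same plug-in idea applies with the mediator probability in place of the mediator mean. The sketch below is illustrative only: `expit`, `effects_binary_mediator` and the example coefficients are assumed names and values, with `beta` on the log-odds scale.

```python
import math

def expit(z):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def effects_binary_mediator(beta, theta, r, r_ref, m, x):
    """CDE, NDE and NIE for a logistic mediator model and linear outcome model.

    beta  = (b0, b1, b2): logit Pr(M = 1 | r, x) = b0 + b1*r + b2*x
    theta = (t0, t1, t2, t3, t4): E[Y | r, m, x] = t0 + t1*r + t2*m + t3*r*m + t4*x
    """
    b0, b1, b2 = beta
    _, t1, t2, t3, _ = theta
    delta = r - r_ref
    p_r = expit(b0 + b1 * r + b2 * x)        # Pr(M = 1) at exposure level r
    p_ref = expit(b0 + b1 * r_ref + b2 * x)  # Pr(M = 1) at reference level r_ref
    cde = (t1 + t3 * m) * delta
    nde = t1 * delta + t3 * delta * p_ref
    nie = (t2 + t3 * r) * (p_r - p_ref)
    return {"cde": cde, "nde": nde, "nie": nie, "total": nde + nie}

# Hypothetical coefficients: treatment raises the log-odds of M = 1 by 1.0.
binary_example = effects_binary_mediator(
    beta=(-0.5, 1.0, 0.2), theta=(1.0, 0.4, 0.6, 0.3, 0.2),
    r=1, r_ref=0, m=0, x=0.0)

# If treatment does not move the mediator, there is no indirect effect.
null_mediator = effects_binary_mediator(
    beta=(0.0, 0.0, 0.0), theta=(1.0, 0.4, 0.6, 0.3, 0.2),
    r=1, r_ref=0, m=0, x=0.0)

print(binary_example)
print(null_mediator)
```

The NDE weights the interaction by the mediator probability under the reference exposure, and the NIE scales the mediator's effect on the outcome by the change in that probability.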
  34. Stata paramed command paramed varname, avar(varname) mvar(varname) a0(real) a1(real) m(real)

    yreg(string) mreg(string) [cvars(varlist) nointeraction casecontrol fulloutput c(numlist) bootstrap reps(integer 1000) level(cilevel) seed(passthru)] varname - this specifies the outcome variable. avar - this specifies the treatment variable. mvar - this specifies the mediator variable. a0 - this specifies the baseline level of the exposure. a1 - this specifies the new exposure level. m - this specifies the level of mediator at which the controlled direct effect is to be estimated.
  35. • Statistical mediation analysis:  Stata: sgmediation, SEM, GSEM 

    Mplus, EQS, AMOS, R – any SEM package  SPSS: PROCESS (Hayes, Preacher) • Causal mediation analysis in software:  Stata: paramed± (Emsley et al.), mediation* (Imai et al.), gformula~ (Daniel et al.)  SAS: gformula~ (Robins et al.), mediation ± (VanderWeele)  SPSS: mediation ± (VanderWeele)  R: mediation* (Imai et al.)  Mplus: (Muthen)  Lange et al. (2012) and Vansteelandt et al. (2012). Software for mediation analysis
  36. Mediation analysis in survival analysis • Suppose we aim to

    decompose a total effect comparing r and r’ into direct and indirect effects, but we have a time-to-event outcome. • Which of the methods that we have discussed can we use? • It turns out that we can estimate causal mediation parameters in survival analysis under the same set of assumptions 1-4. • The interesting thing in survival data is that there are multiple scales we can use to decompose the total effect:  Survival function  Hazard function  Mean survival times
  37. • When baseline values of the outcome variable have been

    measured alternative estimators are available: A. Difference in mean post treatment outcomes between trial arms (post estimator) B. Difference in change scores (over time) between trial arms (change score estimator) C. Conditioning on baseline value (ANCOVA estimator) • All are known to be unbiased for ATE in trials; ANCOVA estimator can be shown to be most precise. • How do we extend this to baseline mediators and outcomes? How to adjust for baseline?
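The three total-effect estimators can be compared in a small Monte Carlo sketch. The data-generating model (baseline-to-follow-up slope of 0.5, true effect 0.5, 200 participants, 1000 replications) and the function name `trial_estimates` are assumptions chosen for illustration. The sketch covers the total treatment effect only, not the mediation parameters.

```python
import numpy as np

def trial_estimates(y0, y1, r):
    """Return the post, change-score and ANCOVA estimates of the treatment effect."""
    post = y1[r == 1].mean() - y1[r == 0].mean()
    diff = y1 - y0
    change = diff[r == 1].mean() - diff[r == 0].mean()
    design = np.column_stack([np.ones(len(r)), r, y0])
    ancova = np.linalg.lstsq(design, y1, rcond=None)[0][1]  # coefficient on r
    return post, change, ancova

rng = np.random.default_rng(7)
tau, n, reps = 0.5, 200, 1000          # hypothetical true effect, trial size, replications
results = np.empty((reps, 3))
for k in range(reps):
    y0 = rng.normal(size=n)                          # baseline outcome
    r = rng.permutation(np.repeat([0, 1], n // 2))   # 1:1 randomisation
    y1 = 0.5 * y0 + tau * r + rng.normal(size=n)     # follow-up outcome
    results[k] = trial_estimates(y0, y1, r)

means = results.mean(axis=0)           # average estimate per method
sds = results.std(axis=0, ddof=1)      # empirical standard error per method
print(dict(zip(["post", "change", "ancova"], means)))
print(dict(zip(["post", "change", "ancova"], sds)))
```

In this set-up all three averages sit close to the true effect, while the ANCOVA estimates vary least across replications, in line with the precision claim above.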
  38. Conclusions on baseline adjustment • If mediation analysis is envisaged

    as an addition to total treatment effect estimation in an RCT, then baseline values of the mediator as well as the outcome should be measured. • In contrast to total treatment effect estimation only the ANCOVA approach C can generally avoid bias in mediation parameters.  Mediation analyses that do not use the baseline values (Mediation approach A - post) will always incur bias for NIE and NDE.  Mediation analyses that incorporate the baseline values into change scores (Mediation approach B – change score) can only avoid bias when baseline measures do not predict change in the mediator. • The ANCOVA approach is recommended to avoid confounding bias in mediation investigations. • Landau, Emsley and Dunn (2017), under review
  39. Even more realistic trial data

    [Diagram: Treatment, with Mediator and Outcomes measured at Baseline, Time 1, Time 2 and Time 3]
  40. Why use latent variable models? • The rationale for suitable

    latent variable models for multivariate response is that they:  Explain generation of responses by a well-defined set of latent “driving variables” • …so that relationships between mediator and clinical responses can later be explained by relationships between these sets of latent variables.  For longitudinal structures, these models are not restricted to equidistantly spaced time points.  They allow for inclusion of measurement errors, which is important when thinking of confounding (measured or hidden/unmeasured).
  41. Longitudinal mediation with SEM • The univariate model for the

    mediator is driven by the baseline levels (the random intercept, Im) and the linear change (the random slope, Sm).  the errors ε1, ε2 and ε3 are independent  the observed scores are explained by Im and Sm  the means and variances of Im and Sm are freely estimated  the covariance between Im and Sm is freely estimated [Path diagram: observed mediator scores M0, M1, M2 with errors ε1, ε2, ε3, explained by the latent intercept Im and slope Sm]
  42. Longitudinal mediation with SEM • These univariate models are then

    combined in a bivariate growth model, and the treatment variable R included as a cause of the random slope of the mediator Sm, and the random slope of the outcome Sy. • Since R is randomly assigned, and Im represents the baseline values, we can assume that there is no covariance between R and Im. • The aim is to assess whether the intervention affects the growth trajectory of the mediating variable, which in turn affects the growth trajectory of the outcome variable.
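  43. Longitudinal mediation with SEM

    [Path diagram: bivariate growth model. Mediator scores M0, M1, M2 (errors ε1 to ε3) with latent intercept Im and slope Sm; outcome scores Y0, Y1, Y2 (errors ε4 to ε6) with latent intercept Iy and slope Sy; other labels shown: Z, β1, ψ1, ψ2]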
  44. Longitudinal mediation with SEM • Redefine the Baron and Kenny

    steps as follows: 1. demonstrate that treatment, R, has an effect on the slope of the outcome Sy 2. demonstrate that treatment, R, has an effect on the slope of the putative mediator Sm 3. demonstrate that the slope of the mediator Sm has an effect on the slope of the outcome Sy after controlling for treatment R. • Or: take estimates of these parameters to calculate the NIE and NDE using the formulas given previously.
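One way to write down the growth model behind these steps is sketched below. The notation is assumed for illustration and is not taken from the slides: time scores $\lambda_t$ fixed at 0, 1, 2 for three equally spaced assessments, path labels $a$, $b$ and $c'$, and no interaction between R and the mediator slope.

```latex
\begin{aligned}
M_{it} &= I_{m,i} + \lambda_t S_{m,i} + \varepsilon_{m,it}, \qquad
Y_{it} = I_{y,i} + \lambda_t S_{y,i} + \varepsilon_{y,it}, \qquad \lambda_t \in \{0, 1, 2\} \\
S_{m,i} &= \alpha_m + a R_i + \zeta_{m,i} \\
S_{y,i} &= \alpha_y + c' R_i + b S_{m,i} + \zeta_{y,i} \\
\text{indirect effect on } S_y &= a b, \qquad \text{direct effect on } S_y = c'
\end{aligned}
```

Under these assumptions step 2 concerns $a$, step 3 concerns $b$, and step 1 concerns the total effect on the outcome slope, $c' + ab$. Unequally spaced assessments would change only the fixed values of $\lambda_t$.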
  45. Some references (personal) • Dunn G, Emsley RA, Liu H,

    Landau S, Green J, White I and Pickles A. (2015). Evaluation and validation of social and psychological markers in randomised trials of complex interventions in mental health. Health Technology Assessment 19(93). • Dunn G, Emsley RA, Liu H & Landau S. (2013). Integrating biomarker information within trials to evaluate treatment mechanisms and efficacy for personalised medicine. Clinical Trials, 10(5):709-19. • Emsley RA & Dunn G. Process evaluation using latent variables: applications and extensions of finite mixture models. (2013). In: Advances in Latent Variables. Eds. Brentari E., Carpita M., Vita e Pensiero, Milan, Italy. ISBN: 9788834325568. • Emsley RA & Dunn G. (2012) Evaluation of potential mediators in randomized trials of complex interventions (psychotherapies). In: Causal Inference: Statistical perspectives and applications. Eds: Berzuini C, Dawid P & Bernardinelli, L. Wiley. • Emsley RA, Dunn G & White IR. (2010). Modelling mediation and moderation of treatment effects in randomised controlled trials of complex interventions. Statistical Methods in Medical Research, 19(3), 237-270. • VanderWeele TJ, Emsley RA. (2013). Discussion of “Experimental designs for identifying causal mechanisms”. JRSS-A, 176(1), pp46.
  46. • Albert JM (2008). Mediation analysis via potential outcomes models.

    Statistics in Medicine 27, 1282-1304. • Daniel, R. M., De Stavola, B. L., and Cousens, S. N. (2011). gformula: Estimating causal effects in the presence of time-varying confounding or mediation using the g-computation formula. The Stata Journal 11(4):479-517. • De Stavola B., Daniel, R., Ploubidis, G., Micali, N. (2015). Mediation Analysis With Intermediate Confounding: Structural Equation Modeling Viewed Through the Causal Inference Lens. AJE, 181(1), 64-80. • Dunn G & Bentall R (2007). Modelling treatment-effect heterogeneity in randomized controlled trials of complex interventions (psychological treatments). Statistics in Medicine, 26, 4719-4745. • Gallop R, Small DS, Lin JY, Elliot MR, Joffe MM & Ten Have TR (2009). Mediation analysis with principal stratification. Statistics in Medicine 28, 1108-1130. • Gennetian, L.A., Morris, P.A., Bos, J.M. and Bloom, H.S. (2005). In H.S. Bloom (Ed.), Learning More From Social Experiments (pp75-114). New York: Russell Sage Foundation. • Lynch K, Cary M, Gallop R, Ten Have TR (2008). Causal mediation analyses for randomized trials. Health Services & Outcomes Research Methodology 8, 57-76. • Hicks R and Tingley D. Causal Mediation Analysis. The Stata Journal, 11(4):609-15, 2011. • Imai, Kosuke, Luke Keele and Dustin Tingley (2010) A General Approach to Causal Mediation Analysis, Psychological Methods 15(4) pp. 309-334. • Imai, Kosuke, Luke Keele and Teppei Yamamoto (2010) Identification, Inference, and Sensitivity Analysis for Causal Mediation Effects, Statistical Science, 25(1) pp. 51-71. Some references (others)
  47. • MacKinnon DP (2008). Introduction to Statistical Mediation Analysis. New

    York: Taylor & Francis Group. • Ten Have, T.R., Joffe, M. and Cary, M. (2003). Causal logistic models for non-compliance under randomized treatment with univariate binary response. Statistics in Medicine 22, 1255-1283. • Ten Have TR, Joffe MM, Lynch KG, Brown GK, Maisto SA & Beck AT (2007). Causal mediation analyses with rank preserving models. Biometrics 63, 926-934. • Valeri L and VanderWeele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods, 18(2):137-50, 2013. • VanderWeele, T. J. & Vansteelandt, S. 2009, "Conceptual issues concerning mediation, interventions and composition", Statistics and Its Interface, vol. 2, no. 4, pp. 457-468. • VanderWeele, T. J. & Vansteelandt, S. 2010, "Odds Ratios for Mediation Analysis for a Dichotomous Outcome", American Journal of Epidemiology, vol. 172, no. 12, pp. 1339-1348. • VanderWeele, T. J. & Arah, O. A. 2011, "Bias Formulas for Sensitivity Analysis of Unmeasured Confounding for General Outcomes, Treatments, and Confounders", Epidemiology, vol. 22, no. 1, pp. 42-52. Some references (others)