Work Log 08/09 - Speaker Deck

Slide 1

Slide 1 text

Work Log 08/09 Recurrent Analysis Introduction Python(, R) Tutorial Lab Future Plan 2013.09.01 Bioinformatics and Biostatistics Core, NTU Center of Genomic Medicine, National Taiwan University Slides by Liang Bo Wang

Slide 8

Slide 8 text

2013.09.01 Bioinformatics and Biostatistics Core, NTU Center of Genomic Medicine, National Taiwan University Slides by Liang Bo Wang 42 Biometrics, March 2010 The score function derived from the logged pairwise pseudolikelihood can be expressed as i< k I(t Yik , u Yik ) × − exp{ρik (t, u) β} 1 + exp{ρik (t, u) β} ρik (t, u) dNi (t) dNk (u), where Yik = Yi ∧ Yk . Recall that Di and Dk denote the observed data of the ith and the kth subject. Define the function h(Di , Dk ; β) = I(t Yik , u Yik ) × − exp{ρik (t, u) β} 1 + exp{ρik (t, u) β} ρik (t, u) dNi (t) dNk (u), and denote the score function by Sp (β) = 1 (n 2 ) i< k h(Di , Dk ; β). It is easy to see that h is permutation symmetric in its ar- guments and Sp is a U-statistic with the kernel h(·, ·). If β is the true parameter value, it can be shown that the score function Sp (β) = 0. Applying the projection method devel- oped by Hoeffding (1948) and under the assumption that X(t) is bounded by M, we can show that score function √ nSp (β) converges to a normal distribution with mean 0 and variance– covariance V 1 = 4E{h(D1, D2 ; β) h(D1, D3 ; β) }. We then study the asymptotic properties of ˆ β using delta method. The large sample properties of ˆ β are stated in Theorem 1, with proofs given in the Appendix. Theorem 1: Assume that X(t) is bounded by M and E[N(τ)] < ∞. Let ˆ β be the solution of Sp (β) = 0. Then ˆ β is a consistent estimator of β. Furthermore, √ n(ˆ β − β) → N(0, V ), where V = V −1 2 V 1 V −1 2 with V 1 = 4 E{h(D1 , D2 ; β) h(D1 , D3 ; β) } and V 2 = −E{∂h(D1 , D2 ; β)/∂β}. Note that the variance covariance matrix V can be estimated by ˆ V −1 2 ˆ V 1 ˆ V −1 2 , where ˆ V 1 = 4 n n i=1 1 n −1 2 i< j < k h(Di , Dj ; ˆ β)h(Di , Dk ; ˆ β) , and ˆ V 2 = − 1 (n 2 ) i< k ∂h(Di , Dk ; ˆ β) ∂β . Also note that based on the score function Sp , a test statistic for testing the hypothesis β = 0 can be formulated based on 1 (n 2 ) i< k Ni (Yik )Nk (Yik ) Yi k 0 {Xi (u) − Xk (u)} × dNi (u) Ni (u) − dNk (u) Nk (u) . Interestingly, this test statistic for the effects of time- dependent covariates does not require information about time-independent covariates, and hence is useful for check- ing proportionality assumption in the usual proportional rate model. 3.2 Estimation of Λ0 and γ By definition πi (t) can be considered as the distribution function of a biased sample from the distribution F (t) = Λ0 (t)/Λ0 (τ), where the observations are sampled with a probability proportional to exp{Xi (t) β}. It is easy to see that, under the assumption (1), the conditional likelihood (2) is computationally equivalent to the likelihood of a set of independent random variables, where the data are a biased sample from distribution function F(t) with sampling weight proportional to exp{Xi (t) β} and are right truncated by Yi . Thus event times in the risk set are observed with different sampling probabilities, where the probabilities are proportional to exp{Xi (t) β}. If β is known, the probability structure of the risk set can be recovered by using the inverse probability weighting technique. Following the spirit of the Breslow estimator of the cumulative hazard in the Cox model, we modify the truncation product-limit estimator (Wang et al., 1986) as follows to estimate F by assigning each event in the risk set a weight that is proportional to the inverse of the sampling weight function: sl > t 1 − dl (β) Rl (β) , where dl (β) = 1 n n i=1 m i j =1 I(tij = sl ) exp{−Xi (tij ) β} and Rl (β) = 1 n n i=1 m i j =1 I(tij sl Yi ) exp{−Xi (tij ) β}. Note that β = 0 implies that the assigned weight is a unit weight and the proposed estimator reduces to the product-limit estimator that maximizes the nonparametric likelihood function for truncated data. By replacing β with ˆ β, we estimate F with ˆ F (t; ˆ β) = sl > t 1 − dl (ˆ β) Rl (ˆ β) . The large sample properties of ˆ F (t) are stated in Lemma 1, with proofs given in the Appendix. Lemma 1: Assume that (a) Λ0 (τ) > 0, (b) P r(Y > τ, Z > 0) > 0, and (c) G(u) = E{ZI(Y u) exp(W γ)} is a con- tinuous function for u ∈ [0, τ]. For inf{y : Λ0 (y) > 0} < t τ, √ n{ ˆ F (t) − F (t)} converges weakly to a normal distribution with mean 0 and variance 4F (t)2 E{κ(D1 , D2 ; t, β) κ(D1 , D3 ; t, β)}, where κ is defined in the Appendix. To estimate the regression parameters of time-independent covariates, we note that, conditioning on {Zi , Yi , W i , Xi (Yi )}, the expected value of mi is given by E{mi |Zi , Yi , W i , Xi (Yi )} = Zi Λ0 (τ) exp{W i γ} I(Yi u) exp{Xi (u) β} dF (u). Thus, following the assumption E[Zi | W i , Xi (τ)] = 1 and by double expectation, we have E        mi Yi 0 exp{Xi (u) β} dF (u) | W i        = Λ0 (τ) exp(W i γ). Recurrent Event Data Analysis 41 Assumption (1) implies that the occurrence of recurrent events follows a proportional intensity model, where the unobserved frailty Zi inflates/deflates the intensity. Because of the memoryless property of a Poisson process, conditional on Zi , the rate function equals the intensity function of the recurrent event process. Under (1) and E[Zi | W i , Xi (τ)] = 1, the rate function of event occurrence at time t in a random population is given by λ0 (t) exp{Xi (t) β + W i γ}. Thus (1) implies the proportional rate model for recurrent event data studied by Lin et al. (2000) and many others. The proposed model also reduces to the semiparametric model studied in Wang et al. (2001) in the absence of time-dependent covariate Xi (t), and is in line with the model for case series data studied in Farrington and Whitaker (2006) in the absence of time-independent covariate Wi . 3. Estimation Procedure 3.1 Estimation of β We denote by mi the number of recurrent events that occurred before time Yi and ti1, . . . , tim i the observed event times for subject i. For ease of notation, we use mi and tij , i = 1, 2, . . . , n, j = 1, 2, . . . , mi , to denote either random variables or realized values. Let Di denote the observed data of the ith subject, that is, Di = {(Yi , W i , Xi (Yi ), mi , (ti1, . . . , tim i )}. As- sume that {(Zi , Xi (·), W i , Yi , Ni (·)); i = 1, . . . , n} are independent and identically distributed (i.i.d.), so that the {D1 , . . . , Dn } are also i.i.d. For estimation of the regression parameter β, an initial at- tempt might apply the conditional technique used in Wang et al. (2001) to eliminate nuisance parameters from the likelihood. Under (1) the event times (ti1, . . . , tim i ) of the ith subject conditional on (Zi , Yi , mi , W i , X(Yi )) are order statistics of a set of i.i.d. random variables with the density function Zi λ0 (t) exp{Xi (t) β + W i γ} Yi 0 Zi λ0 (u) exp{Xi (t) β + W i γ}du , 0 t Yi . Note that both the unobserved frailty Zi and the time- independent covariates Wi are eliminated from the conditional density function. The conditional likelihood based on all subjects is proportional to n i=1 m i j =1 λ0 (t) exp{Xi (tij ) β} Yi 0 λ0 (u) exp{Xi (u) β}du = n i=1 m i j =1 dπi (tij ) πi (Yi ) , (2) where dπi (t) is the shape function of the recurrent event process given by dπi (t) = λ0 (t) exp{Xi (t) β} τ 0 λ0 (u) exp{Xi (u) β}du I(0 t τ), and πi (t) = t 0 dπi (u). Note that πi defines a proper distribution function with πi (τ) = 1. When the recurrent event model only includes time- independent covariates, the conditional density function fur- ther reduces to λ0 (t)/Λ0 (Yi ), which is in the form of a truncated density function. It is easy to see that for this special case the conditional likelihood depends only on λ0 and is computationally equivalent to the nonparametric likelihood of independently right-truncated data. Hence the reduced conditional likelihood is maximized by the product-limit estimator for independently right-truncated data (Wang, Jewell, and Tsai, 1986). In the presence of time-varying covariates, however, the conditional likelihood (2) involves both the paramet- ric component Xi (t) β and the nonparametric component λ0 . Maximizing the conditional likelihood function is a challenge because the integral in the denominator of the conditional likelihood does not have a closed form with λ0 unspecified. Motivated by Liang and Qin (2000) and Kalbfleisch (1978), we propose an alternative estimation procedure for β that does not involve the nonparametric component λ0 , and hence has the advantage of computational convenience. Because (2) is computationally equivalent to the semiparametric likelihood of a set of independently right-truncated random variables, we can reformulate the problem as esti- mating the regression parameter β using the data {tij , i = 1, . . . , n, j = 1, . . . , mi }, where tij is an observed event time with the distribution function πi and is subject to independent right truncation Yi . The pairwise pseudolikelihood method considered by Liang and Qin (2000), however, can- not be applied directly to truncated data: event times are not necessarily comparable because they are subject to different truncation times. The observation of tij is subject to the con- straint tij Yi , hence any two event times tij and tkl , i = k, are comparable if tij belongs to the observation interval of tkl and tkl belongs to the observation interval of tij . These con- straints amount to tij Yi ∧ Yk and tkl Yi ∧ Yk , where ∧ denotes minimum. For any two event times tij and tkl , let δijkl = 1 if (tij , tkl ) is a comparable pair, and 0 otherwise. We condition on hav- ing observed the values {tij , tkl } for a given pair, but without knowing the order. We refer to this as conditioning on the order statistics of (tij , tkl ). The conditional distribution is de- generate at the observed values if (tij , tkl ) are not comparable. By conditioning on the order statistics of (tij , tkl ) and δijkl = 1, the pairwise pseudolikelihood of (tij , tkl ), i < k, is given by dπi (tij )dπk (tk l ) dπi (tij )dπk (tk l ) + dπi (tk l )dπk (tij ) = exp[{Xi (tij ) + Xk (tk l )} β] exp[{Xi (tij ) + Xk (tk l )} β + exp[{Xi (tk l ) + Xk (tij )} β] = 1 1 + exp{ρik (tij , tk l ) β} , (3) where ρik (t, u) = Xi (u) + Xk (t) − Xi (t) − Xk (u). Interest- ingly, the pairwise pseudolikelihood depends on the regression parameter β but not the nonparametric component λ0 . Hence β can be estimated by maximizing the pairwise pseudolikelihood i< k m i j =1 m k l=1 1 1 + exp{ρik (tij , tk l ) β} δi j k l . Lots of MATH and PROOF!! GOAL! R Package for Use!

Slide 1

Slide 1 text

Slide 2

Slide 2 text

Slide 3

Slide 3 text

Slide 4

Slide 4 text

Slide 5

Slide 5 text

Slide 6

Slide 6 text

Slide 7

Slide 7 text

Slide 8

Slide 8 text

Slide 9

Slide 9 text

Slide 10

Slide 10 text

Slide 11

Slide 11 text

Slide 12

Slide 12 text

Slide 13

Slide 13 text

Slide 14

Slide 14 text

Slide 15

Slide 15 text

Slide 16

Slide 16 text

Slide 17

Slide 17 text

Slide 18

Slide 18 text

Slide 19

Slide 19 text

Slide 20

Slide 20 text

Slide 21

Slide 21 text

Slide 22

Slide 22 text

Slide 23

Slide 23 text

Slide 24

Slide 24 text

Slide 25

Slide 25 text