Numerical estimates of marginal densities that are not analytically available
- Use of sampling methods instead of implementing sophisticated numerical analytic ones
- More attractive because of their simplicity and ease of implementation
A) For $i = 1, \ldots, k$ the conditional distributions $[U_i \mid U_j,\ j \neq i]$ are available.
- We can also consider the reduced forms $[U_i \mid U_j,\ j \in S_i \subset \{1, \ldots, k\}]$
- B) The functional form of the joint density of $U_1, U_2, \ldots, U_k$ is known and at least one $[U_i \mid U_j,\ j \neq i]$ is available
We assume $U_1, \ldots, U_k$ have a joint distribution whose density function is strictly positive over the sample space, so that:
- The full set of conditional specifications uniquely defines the full joint density
- Densities exist with respect to either Lebesgue or counting measure for all marginal and conditional distributions
The substitution algorithm originates in finding fixed-point solutions to certain classes of integral equations.
- Its utility in statistical problems was developed by Tanner and Wong (1987), who called it a data-augmentation algorithm
Development for two random variables $X$ and $Y$:
$$[X] = \int [X \mid Y] * [Y] \, dY \tag{1}$$
$$[Y] = \int [Y \mid X] * [X] \, dX \tag{2}$$
$$[X] = \int [X \mid Y] \int [Y \mid X'] * [X'] \, dX' \, dY = \int h(X, X') * [X'] \, dX' \tag{3}$$
where $h(X, X') = \int [X \mid Y] * [Y \mid X'] \, dY$, with $X'$ a dummy argument and $[X'] \equiv [X]$.
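As a sanity check on equation (3), here is a minimal numerical sketch in Python, using a two-state toy joint distribution of my own choosing (not from the paper), showing that the true marginal $[X]$ is a fixed point of the kernel $h$; a sum over the finite state space replaces the integral:

```python
import numpy as np

# Hypothetical two-state toy joint distribution p[x, y] = [X = x, Y = y]
p = np.array([[0.10, 0.30],
              [0.40, 0.20]])

px = p.sum(axis=1)               # marginal [X]
py = p.sum(axis=0)               # marginal [Y]
x_given_y = p / py               # [X | Y], indexed [x, y]
y_given_x = (p / px[:, None]).T  # [Y | X], indexed [y, x]

# h(x, x') = sum_y [X = x | Y = y] [Y = y | X = x']  (sum replaces the integral)
h = x_given_y @ y_given_x

# [X] solves the fixed-point equation [X] = sum_{x'} h(x, x') [X = x']
print(np.allclose(h @ px, px))   # True: px is a fixed point of h
```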
With $X$, $Y$, $Z$, three random variables:
$$[X] = \int [X, Z \mid Y] * [Y] \, dY \, dZ \tag{4}$$
$$[Y] = \int [Y, X \mid Z] * [Z] \, dZ \, dX \tag{5}$$
$$[Z] = \int [Z, Y \mid X] * [X] \, dX \, dY \tag{6}$$
By substitution:
- $[X] = \int h(X, X') * [X'] \, dX'$
- where $h(X, X') = \int [X, Z \mid Y] * [Y, X \mid Z] * [Z, Y \mid X']$, integrating over the remaining variables
- with $X'$ a dummy argument and $[X'] \equiv [X]$
Extension to $k$ variables is straightforward.
If at each iteration $i$ we generate $m$ iid pairs $(X_j^{(i)}, Y_j^{(i)})$ ($j \in \{1, \ldots, m\}$), we can estimate $[X]$ with the Monte Carlo integration (a sketch follows below):
$$[\hat{X}]_i = \frac{1}{m} \sum_{j=1}^{m} \left[ X \mid Y_j^{(i)} \right]$$
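A minimal sketch of this mixture estimate, using a standard bivariate normal with correlation $\rho$ as an illustrative example (my choice, not the paper's), where $[X \mid Y = y]$ is $N(\rho y, 1 - \rho^2)$:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
rho, m = 0.8, 500
sd = np.sqrt(1 - rho**2)

# m iid pairs (X_j, Y_j); only the Y_j enter the density estimate below
y = rng.normal(size=m)
x = rho * y + sd * rng.normal(size=m)

# [X hat](t): average of the conditional densities [X = t | Y_j],
# i.e. an equal-weight mixture of m normal densities N(rho * Y_j, 1 - rho^2)
def x_density_hat(t):
    return norm.pdf(t, loc=rho * y, scale=sd).mean()

print(x_density_hat(0.0), norm.pdf(0.0))  # estimate vs. true N(0, 1) density
```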
We obtain convergence of $[\hat{X}]_i$ to $[X]$ since:
$$\int \left| [\hat{X}]_i - [X] \right| \leq \int \left| [\hat{X}]_i - [X]_i \right| + \int \left| [X]_i - [X] \right|$$
and
- $[\hat{X}]_i \xrightarrow{P} [X]_i$ when $m \to \infty$ (Glick 1974)
- $[X]_i \xrightarrow{L_1} [X]$ when $i \to \infty$ (Tanner and Wong 1987)
Extension to more than two variables is straightforward.
- In the three-variable case, with an arbitrary starting marginal density $[X]_0$ for $X$:
- $X^{(0)} \sim [X]_0$
- $(Z^{(0)'}, Y^{(0)'}) \sim [Z, Y \mid X^{(0)}]$
- $(Y^{(1)}, X^{(0)'}) \sim [Y, X \mid Z^{(0)'}]$
- $(X^{(1)}, Z^{(1)}) \sim [X, Z \mid Y^{(1)}]$
- Six generated variables are required
Repeating this cycle $i$ times produces $(X^{(i)}, Y^{(i)}, Z^{(i)})$ such that:
- $X^{(i)} \xrightarrow{d} X \sim [X]$, $Y^{(i)} \xrightarrow{d} Y \sim [Y]$ and $Z^{(i)} \xrightarrow{d} Z \sim [Z]$
- At the $i$th iteration, for $m$ generations:
- $(X_j^{(i)}, Y_j^{(i)}, Z_j^{(i)})$ iid samples
- $[\hat{X}]_i = \frac{1}{m} \sum_{j=1}^{m} \left[ X \mid Y_j^{(i)}, Z_j^{(i)} \right]$
- The $L_1$ convergence still follows
For $k$ variables $U_1, \ldots, U_k$:
- $k(k-1)$ random variate generations to complete one cycle
- $imk(k-1)$ random generations for $m$ sequences and $i$ iterations
- $[\hat{U}_s]_i = \frac{1}{m} \sum_{j=1}^{m} \left[ U_s \mid U_t = U_{tj}^{(i)};\ t \neq s \right]$
- The $L_1$ convergence still follows
Gibbs sampling was proposed by Geman and Geman (1984) to simulate marginal densities without using all conditional distributions, just the full ones:
- $[X \mid Y, Z]$, $[Y \mid X, Z]$ and $[Z \mid X, Y]$ in the three-variable case
The Gibbs sampling scheme
- With an arbitrary starting set of values $U_1^{(0)}, \ldots, U_k^{(0)}$:
- $U_1^{(1)} \sim [U_1 \mid U_2^{(0)}, \ldots, U_k^{(0)}]$
- $U_2^{(1)} \sim [U_2 \mid U_1^{(1)}, U_3^{(0)}, \ldots, U_k^{(0)}]$
- $U_3^{(1)} \sim [U_3 \mid U_1^{(1)}, U_2^{(1)}, U_4^{(0)}, \ldots, U_k^{(0)}]$
- ...
- $U_k^{(1)} \sim [U_k \mid U_1^{(1)}, \ldots, U_{k-1}^{(1)}]$
- $k$ random variate generations are required in a cycle
- After $i$ iterations $\Rightarrow (U_1^{(i)}, \ldots, U_k^{(i)})$
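A minimal sketch of this scheme for $k = 2$, using a beta-binomial pair chosen here for illustration (not the paper's example): $[X \mid Y] = \text{Binomial}(n, Y)$ and $[Y \mid X] = \text{Beta}(X + a, n - X + b)$, whose true marginal for $X$ is beta-binomial with mean $na/(a+b)$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, a, b = 16, 2.0, 4.0
iters = 5000

x, y = 0, 0.5                       # arbitrary starting values U^(0)
xs = np.empty(iters, dtype=int)
for i in range(iters):
    # One cycle: each variable drawn from its full conditional,
    # conditioning on the most recent values of the others
    x = rng.binomial(n, y)          # X^(i) ~ [X | Y = y]
    y = rng.beta(x + a, n - x + b)  # Y^(i) ~ [Y | X = x]
    xs[i] = x

# Ergodic average as a marginal summary; E[X] = n * a / (a + b) = 5.33...
print(xs[iters // 2:].mean())
```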
Under the sup norm, rather than the $L_1$ norm, the joint density of $(U_1^{(i)}, \ldots, U_k^{(i)})$ converges to the true joint density at a geometric rate in $i$.
For any measurable function $T$ of $U_1, \ldots, U_k$ whose expectation exists,
$$\frac{1}{i} \sum_{l=1}^{i} T\!\left(U_1^{(l)}, \ldots, U_k^{(l)}\right) \xrightarrow{a.s.} E\left[T(U_1, \ldots, U_k)\right] \quad \text{as } i \to \infty$$
Substitution sampling and the Gibbs sampler are equivalent when only the set of full conditionals is available.
- If reduced conditional distributions are available, substitution sampling offers the possibility of acceleration relative to Gibbs sampling.
a) $Y^{(0)'} \sim [Y \mid X^{(0)}, Z^{(0)}]$
b) $Z^{(0)'} \sim [Z \mid Y^{(0)'}, X^{(0)}]$
c) $X^{(0)'} \sim [X \mid Z^{(0)'}, Y^{(0)'}]$
d) $Y^{(1)} \sim [Y \mid X^{(0)'}, Z^{(0)'}]$
e) $Z^{(1)} \sim [Z \mid Y^{(1)}, X^{(0)'}]$
f) $X^{(1)} \sim [X \mid Z^{(1)}, Y^{(1)}]$
If $[Z \mid Y]$ is available, e) becomes $Z^{(1)} \sim [Z \mid Y^{(1)}]$
$$J_h = E_g\!\left[\frac{h(X) f(X)}{g(X)}\right] = \int_{\mathcal{H}} \frac{h(x) f(x)}{g(x)}\, g(x)\, dx$$
MC solution:
$$\bar{h}_m = \frac{1}{m} \sum_{i=1}^{m} \frac{h(x_i) f(x_i)}{g(x_i)} \quad \text{with } (x_1, \ldots, x_m) \sim g$$
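A minimal Python sketch of this Monte Carlo solution; the target $f$, instrumental $g$, and integrand $h$ below are my own illustrative choices:

```python
import numpy as np
from scipy.stats import norm, t

rng = np.random.default_rng(2)
m = 10_000

# Estimate J_h = E_f[h(X)] for f = N(0, 1), using a heavier-tailed t(5) as g
x = t.rvs(df=5, size=m, random_state=rng)
w = norm.pdf(x) / t.pdf(x, df=5)    # importance ratios f(x_i) / g(x_i)
h = x**2                            # h(x) = x^2, so J_h = Var(X) = 1

print((h * w).mean())               # approx. 1
```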
- Choose an importance-sampling distribution $[Y]_s$ for $Y$
- Use $[X \mid Y] * [Y]_s$ as an importance-sampling distribution for $(X, Y)$
- $(X_l, Y_l)$ is created by drawing $Y_l \sim [Y]_s$ and $X_l \sim [X \mid Y_l]$ ($l = 1, \ldots, N$)
- Calculate $r_l = \dfrac{[X_l, Y_l]}{[X_l \mid Y_l] * [Y_l]_s}$
Dividing the numerator and the denominator by $N$ and using the law of large numbers, we obtain:
Theorem R1 (convergence): $[\hat{X}] \to [X]$ with probability 1 as $N \to \infty$, for almost every $X$
In the three-variable case, we need (a sketch of the two-variable version follows below):
- The functional form of $[X, Y, Z]$
- The availability of $[X \mid Y, Z]$
- An importance-sampling distribution $[Y, Z]_s$
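A minimal sketch of the two-variable version of Rubin's scheme, again on a bivariate normal of my own choosing, with a deliberately over-dispersed $[Y]_s$; the weighted mixture $\sum_l r_l [X \mid Y_l] / \sum_l r_l$ estimates $[X]$:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
rho, N = 0.8, 5000
sd = np.sqrt(1 - rho**2)

# Draw Y_l from the importance distribution [Y]_s, then X_l ~ [X | Y_l]
y = rng.normal(scale=2.0, size=N)      # [Y]_s = N(0, 4), over-dispersed
x = rho * y + sd * rng.normal(size=N)  # [X | Y] = N(rho * y, 1 - rho^2)

# r_l = [X_l, Y_l] / ([X_l | Y_l] * [Y_l]_s); since [X_l, Y_l] =
# [X_l | Y_l] * [Y_l] with [Y] = N(0, 1), the conditional densities cancel
r = norm.pdf(y) / norm.pdf(y, scale=2.0)

# Marginal density estimate: weighted mixture of the conditionals [X = t | Y_l]
def x_density_hat(t):
    return np.sum(r * norm.pdf(t, loc=rho * y, scale=sd)) / np.sum(r)

print(x_density_hat(0.0), norm.pdf(0.0))  # estimate vs. true N(0, 1) density
```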
Consider a one-parameter genetic-linkage example.
- Some observations are not assigned to individual cells but to aggregates of cells $\Rightarrow$ we use multinomial sampling
We have a three-variable case $(\theta, \eta, Z)$ with interest in the marginal distributions:
- $[\theta \mid Y]$, $[\eta \mid Y]$ and $[Z \mid Y]$
- We can remark that $[Z \mid Y, \theta, \eta]$ is the product of four independent binomials $X_1$, $X_3$, $X_5$ and $X_7$ $\Rightarrow$ $[X_i \mid Y, \theta, \eta] = \text{Binomial}\!\left(Y_i, \frac{a_i \theta}{a_i \theta + b_i}\right)$
To compare the two forms of iterative sampling, the authors:
- obtained numerical estimates of $[\theta \mid Y]$ and $[\eta \mid Y]$
- repeated the following scheme 5,000 times:
- initialize: $\theta \sim U(0, 1)$, $\eta \sim U(0, 1)$, with $0 \leq \theta + \eta \leq 1$
- run 4 cycles of the two samplers with $m = 10$
- compared the average cumulative posterior probabilities
We note that:
- The substitution sampler adapts more quickly than the Gibbs sampler
- Over time, the two samplers reach the same performance
- Few random variate generations are required to obtain convergence ($m = 10$)
Rubin's importance sampling requires:
- $[Z \mid Y]_s$ to draw $Z_l$
- $[\eta \mid Y, Z]$ to draw $\eta_l$
- $[\theta \mid \eta, Z, Y]$ to draw $\theta_l$
The $r_l$ ratio is given by:
$$r_l = \frac{[Y, Z_l \mid \theta_l, \eta_l] * [\theta_l, \eta_l]}{[\theta_l \mid \eta_l, Z_l, Y] * [\eta_l \mid Y, Z_l] * [Z_l \mid Y]_s}$$
The authors obtained the following average cumulative posterior probabilities with:
- 2,500 simulations
- $[Z \mid Y]_s$ taken as the product of $X_1 \sim \text{Binomial}(Y_1, \frac{1}{2})$ and $X_5 \sim \text{Binomial}(Y_4, \frac{1}{2})$
Gibbs sampling: after $i$ iterations with $m$ repetitions:
- $(\lambda_{1l}^{(i)}, \ldots, \lambda_{pl}^{(i)}, \beta_l^{(i)})$ with $l = 1, \ldots, m$
- $[\hat{\lambda}_j \mid Y] = \frac{1}{m} \sum_{l=1}^{m} \Gamma\!\left(\alpha + s_j,\ \Big(t_j + \frac{1}{\beta_l^{(i)}}\Big)^{-1}\right)$
- $[\hat{\beta} \mid Y] = \frac{1}{m} \sum_{l=1}^{m} \text{IG}\!\left(\gamma + p\alpha,\ \Big(\sum_j \lambda_{jl}^{(i)} + \delta\Big)^{-1}\right)$
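A minimal sketch of how these mixture estimates could be evaluated, assuming a shape-scale parametrization matching the formulas above (scipy's `invgamma` scale is the reciprocal of the IG second argument); the hyperparameters and the arrays `lam` and `beta` standing in for the Gibbs draws are hypothetical, not the paper's values:

```python
import numpy as np
from scipy.stats import gamma, invgamma

rng = np.random.default_rng(4)
alpha, gamma0, delta = 1.8, 0.1, 1.0   # hypothetical hyperparameters
s = np.array([5.0, 1.0, 14.0])         # hypothetical counts s_j
t = np.array([94.3, 15.7, 125.8])      # hypothetical exposures t_j
p, m = len(s), 200

# Hypothetical stand-ins for the Gibbs output at iteration i: beta[l], lam[j, l]
beta = invgamma.rvs(gamma0 + p * alpha, scale=2.0, size=m, random_state=rng)
lam = gamma.rvs(alpha + s[:, None], scale=1.0 / (t[:, None] + 1.0 / beta),
                random_state=rng)

# [lambda_j | Y] estimated as an equal-weight mixture of m Gamma densities
def lam_density_hat(j, x):
    return gamma.pdf(x, alpha + s[j], scale=1.0 / (t[j] + 1.0 / beta)).mean()

# [beta | Y] estimated as a mixture of m inverse-gamma densities
def beta_density_hat(x):
    return invgamma.pdf(x, gamma0 + p * alpha,
                        scale=lam.sum(axis=0) + delta).mean()

print(lam_density_hat(0, 0.06), beta_density_hat(1.0))
```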
- Substitution and Gibbs sampling (iterative methods) provide better results in terms of convergence than Rubin importance sampling (a noniterative method)
- The performance of Rubin importance sampling depends on the choice of the importance distribution
- If some reduced conditional distributions are available, substitution sampling becomes more efficient than Gibbs sampling.
Geman, S., and Geman, D. (1984), "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741.
Glick, N. (1974), "Consistency Conditions for Probability Estimators and Integrals of Density Estimators," Utilitas Mathematica, 6, 61-74.
Rubin, D. B. (1987), Comment on "The Calculation of Posterior Distributions by Data Augmentation," by M. A. Tanner and W. H. Wong, Journal of the American Statistical Association, 82, 543-546.
Tanner, M., and Wong, W. (1987), "The Calculation of Posterior Distributions by Data Augmentation," Journal of the American Statistical Association, 82, 528-550.