Slide 1

Slide 1 text

Dong X et al. Modeling gene expression using chromatin features in various cellular contexts
2012/09/29 ENCODE study group
Itoshi Nikaido / @dritoshi #encodejp-28
RIKEN Center for Developmental Biology

Slide 2

Slide 2 text

The latest version of this material is available at: http://cat.hackingisbelieving.org/lecture/
This file is provided under the Creative Commons Attribution 2.0 Generic license.
Paper: http://genomebiology.com/content/13/9/R53
Search for “catway dritoshi”

Slide 3

Slide 3 text

Can gene expression be predicted from chromatin features?
Four questions
1. Can we reproduce the quantitative relationship between gene expression levels and histone modifications?
2. Does the relationship hold across different human cell lines and between different groups of genes?
3. Do the most predictive chromatin features differ depending on the expression quantification technique used?
4. How well can the chromatin features predict expression levels of RNA from different cell compartments and/or RNA extracted by different

Slide 4

Slide 4 text

Chromatin features regulate transcription
Transcriptional regulation by histone modifications and chromatin structure around genes
Cause = chromatin features: 7 histone modifications, 1 histone variant, and DNase I hypersensitivity, in 7 cell lines
Effect = transcription: RNA-seq, RNA-PET, deepCAGE

Slide 5

Slide 5 text

Representing the chromatin feature data
Build a vector of histone modification and chromatin structure signals around each gene
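As a rough illustration of what such a vector might look like, the sketch below averages a per-base signal into bins and concatenates the bins of every track. The bin count, the log2 transform, the pseudocount, and the toy coverage values are all assumptions made for illustration; the slide does not specify them. The function names are hypothetical.

```python
import math


def bin_means(signal, n_bins):
    """Split a per-base signal into n_bins contiguous bins and return each bin's mean."""
    if n_bins <= 0 or len(signal) < n_bins:
        raise ValueError("need at least one position per bin")
    means = []
    for i in range(n_bins):
        # Integer arithmetic keeps bins contiguous and non-empty.
        start = i * len(signal) // n_bins
        end = (i + 1) * len(signal) // n_bins
        chunk = signal[start:end]
        means.append(sum(chunk) / len(chunk))
    return means


def feature_vector(tracks, n_bins, pseudocount=1.0):
    """Concatenate log2(mean + pseudocount) of every bin of every track.

    The log transform and pseudocount are assumptions, not taken from the slide.
    """
    vector = []
    for name in sorted(tracks):  # fixed track order so vectors line up across genes
        for mean in bin_means(tracks[name], n_bins):
            vector.append(math.log2(mean + pseudocount))
    return vector


# Hypothetical toy coverage around one gene (8 positions per track).
tracks = {
    "H3K4me3": [0, 0, 4, 4, 8, 8, 2, 2],
    "DNaseI": [1, 1, 1, 1, 3, 3, 3, 3],
}
vec = feature_vector(tracks, n_bins=4)
print(vec)
```

With real data the per-bin means would come from a tool such as bigWigSummary rather than from in-memory lists.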

Slide 6

Slide 6 text

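Representing the RNA transcription-level data
Simply split genes into transcription ON and transcription OFF
Classify genes into the two groups (ON/OFF) with random forests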

Slide 7

Slide 7 text

Modeling
Regression models
1. Linear regression
2. Multivariate adaptive regression splines (MARS)
3. Random forests

Slide 8

Slide 8 text

Combining the models
Classification * regression model
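One way to read "classification * regression" is as a product: the ON/OFF call gates the regression estimate. The sketch below assumes a hard 0/1 classifier output; the two toy model functions are hypothetical stand-ins, not trained random forests or the models used in the paper.

```python
def two_step_predict(features, classify_on, regress_level):
    """Predicted expression = ON/OFF call (1 or 0) times the regression estimate."""
    return [
        (1.0 if classify_on(x) else 0.0) * regress_level(x)
        for x in features
    ]


# Hypothetical stand-ins for trained models (assumptions for illustration only).
def toy_classifier(x):
    # Call a gene ON when its first feature exceeds an arbitrary cutoff.
    return x[0] > 0.5


def toy_regressor(x):
    # Arbitrary linear combination of the two features.
    return 2.0 * x[0] + 1.0 * x[1]


genes = [[0.2, 3.0], [1.0, 0.5], [2.0, 1.0]]
predicted = two_step_predict(genes, toy_classifier, toy_regressor)
print(predicted)
```

Genes called OFF get a predicted level of zero regardless of what the regression model would have returned for them.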

Slide 9

Slide 9 text

Evaluating the model
Compare predicted values with measured values
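The slide does not name the comparison metrics. A minimal sketch, assuming Pearson's correlation coefficient and root-mean-square error (RMSE) as the two summary numbers, with made-up values:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical measured and predicted expression levels.
measured = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.5, 2.5, 3.5]
r = pearson_r(predicted, measured)
err = rmse(predicted, measured)
print(r, err)
```

The toy values show why both numbers are useful: a constant offset leaves the correlation perfect while the RMSE stays non-zero.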

Slide 10

Slide 10 text

Prediction performance 1
Compare predicted values with measured values

Slide 11

Slide 11 text

Prediction performance 2
Compare predicted values with measured values

Slide 12

Slide 12 text

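Summary 1
Transcription levels can be predicted well from chromatin features
0. The influence of chromatin features on transcription could be evaluated quantitatively
   Expression ON/OFF: H3K9ac > H3K4me3 > DNase I > H3K4me2 ...
   Transcription level: H3K79me2 > H3K36me3 > DNase I > H3K9ac ...
1. A two-step prediction method is proposed
   ON/OFF classification followed by a regression model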

Slide 13

Slide 13 text

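Summary 2
Other points discussed in the paper
1. Can RNA levels from Nucleus, Cytosol, and Whole Cell be predicted?
   Yes. Cytosol > Whole Cell >> Nucleus
2. Which of RNA-seq, CAGE, and RNA-PET correlates most strongly with chromatin features?
   CAGE > RNA-PET = RNA-seq
3. Can transcription levels in other cells be explained?
   Yes, at about R = 0.8
4. What about the relationship with CpG?
   Predictions are better for High CpG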

Slide 14

Slide 14 text

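My take
What went well and what remains open
What went well
1. It is good that prediction worked
2. It is good that the importance of each chromatin feature became clear
Open issues
1. Is it acceptable to pick the bestbin by correlation?
2. Is a statistical model enough? It does not show how the molecules actually behave
3. Is the comparison among CAGE, RNA-seq, and RNA-PET fair enough?
4. A discussion focused on individual genes would also be welcome
5. Can transcription be designed from the prediction results?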
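To make the bestbin question concrete, the sketch below picks, for a single chromatin feature, the bin whose signal across genes correlates most strongly with expression. It assumes "by correlation" means the highest Pearson coefficient; the function names and toy numbers are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_bin(bin_signals, expression):
    """Return (bin index, r) for the bin most correlated with expression.

    bin_signals[g][b] is the signal of gene g in bin b.
    """
    n_bins = len(bin_signals[0])
    scores = []
    for b in range(n_bins):
        column = [row[b] for row in bin_signals]  # one bin across all genes
        scores.append(pearson_r(column, expression))
    index = max(range(n_bins), key=lambda b: scores[b])
    return index, scores[index]


# Hypothetical toy data: 4 genes x 3 bins for one chromatin feature.
bin_signals = [
    [4.0, 1.0, 2.0],
    [1.0, 2.0, 1.0],
    [3.0, 3.0, 4.0],
    [2.0, 4.0, 3.0],
]
expression = [1.0, 2.0, 3.0, 4.0]
index, r = best_bin(bin_signals, expression)
print(index, r)
```

One possible reading of the concern, not stated on the slide, is that choosing the bin with the same expression values later used for evaluation could make the reported correlation look better than it would on unseen genes.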

Slide 15

Slide 15 text

Software
Classification, regression, and related tools
Calculation of the mean density of chromatin features: bigWigSummary (BigWig and BigBed: enabling browsing of large distributed datasets)
Variable importance: relaimpo (Relative importance of regressors in linear models)
Regression/classification: randomForest (Breiman and Cutler's random forests for classification and regression)
Regression: earth (Multivariate Adaptive Regression Spline Models)

Slide 16

Slide 16 text

The latest version of this material is available at: http://cat.hackingisbelieving.org/lecture/
This file is provided under the Creative Commons Attribution 2.0 Generic license.
Paper: http://genomebiology.com/content/13/9/R53
Search for “catway dritoshi”