Slide 1

Slide 1 text

Research case studies using next-generation sequencers, and the public tools and databases that support them
Introduction of next-generation sequencing applications, related public databases and resources
Tazro Ohta (大田達郎), Database Center for Life Science
Jan., at Kaketsuken (化血研)

Slide 2

Slide 2 text

Goals of this workshop
• Learn what NGS can and cannot do
• Be able to search for NGS-based research case studies and public data
• Learn how to proceed with NGS data analysis

Slide 3

Slide 3 text

What NGS can and cannot do
High-throughput sequencing: could it be the silver bullet?

Slide 4

Slide 4 text

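What NGS can and cannot do
• What is NGS in the first place?
• Sequencing principles of NGS
• About the machines
• Applications to research fields
• The general workflow of a study that uses NGS
• About SRA, the public NGS database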

Slide 5

Slide 5 text

What is NGS in the first place?
(next)^n-generation

Slide 6

Slide 6 text

What is NGS in the first place?
• Next-generation Sequencing
  • “The high demand for low-cost sequencing has driven the development of high-throughput sequencing (or next-generation sequencing) technologies that parallelize the sequencing process, producing thousands or millions of sequences concurrently” from http://en.wikipedia.org/wiki/Next-generation_sequencing#Next-generation_methods
• Also called High-throughput Sequencing, Massively Parallel Sequencing, etc.
• The term is defined relative to conventional methods; it does not refer to a single sequencing principle
• Some people speak of the "n-th generation"

Slide 7

Slide 7 text

About the sequencing principles of NGS
How it works

Slide 8

Slide 8 text

NGS principles at a glance: the lineup of each sequencer vendor
• Roche 454
• Illumina HiSeq/MiSeq
• LifeTech SOLiD
• LifeTech IonTorrent/IonProton
• PacBio RS
• Others
  • Oxford Nanopore MinION/GridION
  • GnuBio

Slide 9

Slide 9 text

Roche 454 (www.454.com)

Slide 10

Slide 10 text

Roche 454 (www.454.com)
• PyroSequencing
  • http://454.com/products/technology.asp
• End of support in 2016 has been announced (as of 2013/10)
  • http://www.genomeweb.com/sequencing/roche-shutting-down-454-sequencing-business
  • http://www.bio-itworld.com/BioIT_Article.aspx?id=131053

Slide 11

Slide 11 text

Illumina HiSeq/MiSeq (www.illumina.com)

Slide 12

Slide 12 text

Illumina HiSeq/MiSeq (www.illumina.com)
• Sequence By Synthesis (SBS)
  • http://res.illumina.com/documents/products/techspotlights/techspotlight_sequencing.pdf
• Update: the $1000 genome was announced as achieved with the HiSeq X Ten (15 Jan. 2014)
• HiSeq X
  • http://nextgenseek.com/2014/01/how-does-a-single-hiseq-x-compares-with-hiseq-2500/
• NextSeq 500
  • http://nextgenseek.com/2014/01/how-does-nextseq-500-compare-with-miseq-and-hiseq/

Slide 13

Slide 13 text

Price
https://twitter.com/dritoshi/status/426178011080048641

Slide 14

Slide 14 text

LifeTech | Applied Biosystems SOLiD
http://www.appliedbiosystems.com/absite/us/en/home/applications-technologies/solid-next-generation-sequencing.html

Slide 15

Slide 15 text

LifeTech | Applied Biosystems SOLiD
http://www.appliedbiosystems.com/absite/us/en/home/applications-technologies/solid-next-generation-sequencing.html
• Sequence By Ligation
• SOLiD: Sequencing by Oligonucleotide Ligation and Detection
  • http://www.appliedbiosystems.com/absite/us/en/home/applications-technologies/solid-next-generation-sequencing/next-generation-systems/solid-sequencing-chemistry.html
• Being replaced by IonTorrent/IonProton

Slide 16

Slide 16 text

LifeTech IonTorrent/IonProton
https://www.lifetechnologies.com/us/en/home/life-science/sequencing/next-generation-sequencing.html

Slide 17

Slide 17 text

LifeTech IonTorrent/IonProton
https://www.lifetechnologies.com/us/en/home/life-science/sequencing/next-generation-sequencing.html
• Semiconductor Sequencing Technology
  • http://www.lifetechnologies.com/us/en/home/life-science/sequencing/next-generation-sequencing/ion-torrent-next-generation-sequencing-technology.html
• Improvements in semiconductor chip performance translate directly into improvements in sequencer performance

Slide 18

Slide 18 text

Pacific Biosciences PacBio RS II (www.pacificbiosciences.com)

Slide 19

Slide 19 text

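Pacific Biosciences PacBio RS II (www.pacificbiosciences.com)
• SMRT Technology
  • http://www.pacificbiosciences.com/products/smrt-technology/
• The first NGS platform able to produce long reads of several thousand bases
• Lower accuracy than other sequencers
  • Can be improved by using error-correction tools
  • Reads can also be corrected in combination with a high-accuracy sequencer such as Illumina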

Slide 20

Slide 20 text

Upcoming Sequencers: Oxford Nanopore MinION/GridION (www.nanoporetech.com)

Slide 21

Slide 21 text

Upcoming Sequencers: Oxford Nanopore MinION/GridION (www.nanoporetech.com)
• Nanopore sensing technology
  • https://www.nanoporetech.com/technology/introduction-to-nanopore-sensing/introduction-to-nanopore-sensing
• MinION: a small sequencer that connects to a computer over USB
• GridION: a massively parallel nanopore sequencer

Slide 22

Slide 22 text

Upcoming Sequencers: gnubio (gnubio.com)

Slide 23

Slide 23 text

Upcoming Sequencers: gnubio (gnubio.com)
• Beta Systems
• Clinical Sequencing
• Sample-to-Answer instrument

Slide 24

Slide 24 text

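Sequencers by read spec (rough overview)
[Figure: platforms plotted by number of reads vs. read length. Platforms shown: Illumina HiSeq, Illumina MiSeq, Roche 454, PacBio RS, IonTorrent/IonProton. Groups: short-read massively parallel type, benchtop type, long-read parallel type.]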

Slide 25

Slide 25 text

About the sequencing read layout
• Single-end or Paired-end (Mate-Pair)?
• Choose by weighing mapping/assembly accuracy against price
[Figure: paired-end sequencing (source: res.illumina.com)]

Slide 26

Slide 26 text

Applications to research fields
Various sequencing applications

Slide 27

Slide 27 text

Applications to research fields: Sequencing Application
Note: Study Type and Library Strategy are easily confused
• Representative Study Types
  • Whole Genome Sequencing
  • Exome
  • Population Genomics
  • Transcriptome
  • Epigenetics (Gene regulation study)
  • Metagenomics
  • Other

Slide 28

Slide 28 text

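Applications to research fields: Sequencing Application
Differences between applications
• Data processing after sequencing differs depending on whether a reference genome exists
• Mapping type (Splice Alignment type)
  • The short nucleotide sequences obtained from the sequencer (reads) are placed onto the reference genome based on sequence similarity (mapping)
• de novo Assembly type
  • Short reads are joined, based on sequence similarity between the reads themselves, and assembled into longer sequences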
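The difference between the two processing types can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions: exact string matching stands in for similarity-based alignment, and the reference, the reads, and the function names `map_read` and `merge_by_overlap` are all hypothetical. Real mappers and assemblers tolerate mismatches and use indexed data structures.

```python
from typing import Optional


def map_read(reference: str, read: str) -> int:
    """Mapping type: return the 0-based position of an exact match, or -1."""
    return reference.find(read)


def merge_by_overlap(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Assembly type: join two reads if a suffix of `left` equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):  # prefer the longest overlap
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None  # no overlap of at least `min_overlap` bases


reference = "ACGTTGCAGGCTA"       # hypothetical toy reference genome
reads = ["GTTGCA", "GCAGGCT"]     # hypothetical toy reads

# With a reference: place each read on it.
positions = [map_read(reference, r) for r in reads]

# Without a reference: join the reads to each other.
contig = merge_by_overlap(reads[0], reads[1])
```

In this toy case the contig built without the reference is the same stretch of sequence that the two reads cover when mapped, which is the intuition behind treating mapping and de novo assembly as two routes to the same downstream analyses.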

Slide 29

Slide 29 text

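Whole Genome Sequencing
• de novo genome sequencing
  • “Sequencing of a single organism”
  • New genomes
  • Builds, from scratch with NGS, the genome of an organism whose genome has not yet been sequenced
  • Assembly type
• Resequencing
  • “Sequencing of a sample with respect to a reference”
  • Polymorphism analysis or comparative genomics for organisms whose genome has already been sequenced
  • Mapping type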

Slide 30

Slide 30 text

Exome Sequencing
• “The study investigates the exons of the genome”
• Requires less sequencing per individual
• Sequencing accuracy matters, because information such as single nucleotide polymorphisms is often what is important
• Reagents such as Illumina TruSeq and Agilent SureSelect are representative
  • Update: Illumina TruSeq has been replaced by Nextera Rapid Capture Exome
  • http://www.illuminakk.co.jp/products/truseq_exome_enrichment_kit.ilmn
  • http://www.illuminakk.co.jp/products/nextera-rapid-capture-exome-kits.ilmn
• Often used to search for disease-causing genes

Slide 31

Slide 31 text

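Population Genomics
• “Study of populations and evolution through genomics”
• Population genetics, statistical epidemiology, etc.
• The target is basically human; mapping-type analysis
• As with exome sequencing, sequencing accuracy is important
• International projects such as the 1000 Genomes Project are representative
  • www.1000genomes.org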

Slide 32

Slide 32 text

1000 Genomes Project (1000genomes.org)

Slide 33

Slide 33 text

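Transcriptome
• “Sequencing and characterization of transcription elements”
• RNA-Seq, miRNA-Seq, meta-transcriptome, etc.
• Reads are either mapped to a reference or assembled de novo
  • To quantify expression levels: map a large number of reads
  • To analyze expression even in organisms without a reference genome: assemble
• An application often compared with microarrays
  • Advantages: highly quantitative, wide dynamic range, high resolution
  • Disadvantages: the instrument operation and the analysis methods differ from those for microarrays
• A boom in low-input methods has arrived
  • Single-cell RNA-Seq with methods such as Quartz-Seq and Smart-Seq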

Slide 34

Slide 34 text

Low-input RNA-Seq: Quartz-Seq @ RIKEN, Nikaido lab
bit.accc.riken.jp/protocols/

Slide 35

Slide 35 text

CapR x CLIP-seq
Revealing not only sequence but also structure from large-scale RNA data
Source: genomebiology.com

Slide 36

Slide 36 text

Epigenetics (Gene regulation study)
• “Cellular differentiation study/Study of gene expression regulation”
• ChIP-Seq
  • “Direct sequencing of chromatin immunoprecipitates”
• Bisulfite-Seq
  • “Sequencing following treatment of DNA with bisulfite to convert cytosine residues to uracil depending on methylation status”
• DNase-Seq
  • “Sequencing of hypersensitive sites, or segments of open chromatin that are more readily cleaved by DNaseI.”
• FAIRE-Seq
  • “Formaldehyde-Assisted Isolation of Regulatory Elements”
• etc.
• All of these quantify signal by mapping reads to a reference genome
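As a small illustration of the Bisulfite-Seq idea quoted above, the following sketch simulates the conversion in silico. It assumes the usual reading of the chemistry (unmethylated cytosine becomes uracil and is read as T after amplification, methylated cytosine stays C); the sequence, the methylated position, and the function name `bisulfite_convert` are hypothetical.

```python
from typing import Iterable


def bisulfite_convert(seq: str, methylated_positions: Iterable[int]) -> str:
    """Simulate bisulfite treatment on one strand of DNA.

    Unmethylated C is converted (C -> U, read as T); methylated C is protected.
    """
    protected = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq):
        if base == "C" and i not in protected:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


original = "ACGCCGTA"                                   # hypothetical fragment
treated = bisulfite_convert(original, methylated_positions=[3])

# Cytosines that survive the treatment are inferred to be methylated.
inferred = [i for i, (a, b) in enumerate(zip(original, treated)) if a == "C" and b == "C"]
```

Comparing the treated reads with the reference is what makes this a mapping-type application: the methylation call is simply "which reference cytosines are still read as C".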

Slide 37

Slide 37 text

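Metagenomics
• “Sequencing of a community”
• Human commensal bacteria (oral, gut, etc.)
• Environmental metagenomes
  • Air
  • Ocean
  • Soil
  • Food
• De novo assembly is performed in most cases
  • After joining the reads, annotation is attempted with BLAST
• MicrobeDB.jp has detailed information on past studies
  • microbedb.jp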

Slide 38

Slide 38 text

MicrobeDB.JP (microbedb.jp)

Slide 39

Slide 39 text

Other
• Pooled Clone Sequencing
  • “The study is sequencing clone pools (BACs, fosmids, other constructs)”
• Synthetic genomics
  • “Sequencing of modified, synthetic, or transplanted genomes”

Slide 40

Slide 40 text

Barcode sequencing (from the Yeast Colloquium)
http://yeastcolloquium.wordpress.com

Slide 41

Slide 41 text

Barcode sequencing (from the Yeast Colloquium)
http://yeastcolloquium.wordpress.com

Slide 42

Slide 42 text

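Read specs required for each application

Application | Total bases | Read length | Number of reads (M)
Human genome resequencing | 90-150 Gb | 2x100 | 900-1500
Targeted resequencing | <1 Gb | 2x100 | 10
Exome sequencing | 5-7 Gb | 2x100 | 70
RNA-Seq | 5 Gb | 2x100 | 50
TSS-Seq | 1 Gb | 1x50 | 20
small RNA | 0.35 Gb | 1x35 | >10
Microbial genome | >150 Mb | 2x100 | >1.5
Eukaryotic genome | >4 Gb | 2x100 | >40
Bisulfite-Seq | 90-150 Gb | 2x100 | 900-1500
ChIP-Seq | >6 Gb | 1x100 | 60

Source: 細胞工学 別冊 (Saibo Kogaku supplement), 次世代シーケンサー目的別アドバンストメソッド
Note: the numbers can change with factors such as the size of the target genome, and the information may already be out of date.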
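The columns of this table are tied together by simple arithmetic: total bases = read length x number of reads. The sketch below checks two rows under that assumption. The function names are hypothetical, the read count is taken to count individual reads (both mates of a pair), and the roughly 3 Gb human genome size used for the coverage estimate is an assumption that is not stated on the slide.

```python
def total_bases_gb(read_length: int, reads_millions: float) -> float:
    """Total sequenced bases in Gb = read length x number of reads."""
    return read_length * reads_millions * 1e6 / 1e9


def mean_coverage(total_gb: float, genome_size_gb: float) -> float:
    """Average depth: how many times each base of the genome is read."""
    return total_gb / genome_size_gb


# RNA-Seq row: 100-base reads, 50 M reads
rna_seq_gb = total_bases_gb(100, 50)

# Human resequencing row, lower bound: 100-base reads, 900 M reads
human_gb = total_bases_gb(100, 900)

# ASSUMPTION: human genome of about 3 Gb (not given on the slide)
human_depth = mean_coverage(human_gb, 3.0)
```

Under these assumptions the lower bound of the human resequencing row corresponds to roughly 30x average depth, which is a convenient way to translate the table to a genome of a different size.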

Slide 43

Slide 43 text

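Sequencers by read spec (rough overview)
[Figure: platforms plotted by number of reads vs. read length. Platforms shown: Illumina HiSeq, Illumina MiSeq, Roche 454, PacBio RS, IonTorrent/IonProton. Groups: short-read massively parallel type, benchtop type, long-read parallel type.]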

Slide 44

Slide 44 text

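Applications to research fields: Sequencing Application
• Methods are advancing constantly
• Reading the latest reviews is the best approach
  • Nature Reviews Genetics: Application of next-generation sequencing
  • www.nature.com/nrg/series/nextgeneration/
• Information in books and similar sources is gradually starting to appear
  • It may already be out of date
• Finding someone in the research community who actually does this work is also recommended
  • NGS現場の会 (NGS Field)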

Slide 45

Slide 45 text

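Book: 次世代シークエンサー目的別アドバンストメソッド (Next-Generation Sequencers: Advanced Methods by Purpose)
Contents: Preparation and introduction · Human genome analysis · Gene expression regulation analysis · De novo genome sequencing · Epigenetics analysis · Metagenome analysis · Genome structure analysis · Data analysis tools & storage · Integrated analysis
This is a plug, but nobody asked me to do it and I get no money if the book sells.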

Slide 46

Slide 46 text

E-book: お家でできる MacBookでやる次世代シーケンスデータ解析 (Next-generation sequencing data analysis on a MacBook, something you can do at home)

Slide 47

Slide 47 text

NGS現場の会 (NGS Field): www.ngs-field.org

Slide 48

Slide 48 text

NGS現場の会 (NGS Field): twitter.com/ngsfield

Slide 49

Slide 49 text

The general workflow of a study that uses NGS
NGS practical workflow

Slide 50

Slide 50 text

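The general workflow of a study that uses NGS: what is hard about research with NGS?
Workflow: Sampling → Library prep → Sequencing → Data analysis
• The perception
  • The machines are expensive
  • A lot of data comes out
  • Data analysis is hard to understand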

Slide 51

Slide 51 text

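The general workflow of a study that uses NGS: what is hard about research with NGS?
Workflow: Experimental design → Pilot experiment → Sampling → DNA preparation → Library construction → Sequencing → QC → Filtering → mapping/assemble → QC → Application-specific analysis → Validation experiment → Writing the paper → Data release → Paper submission → Revision / re-analysis → Acceptance → Celebration party
• The reality
  • The celebration party is a long way off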

Slide 52

Slide 52 text

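The general workflow of a study that uses NGS: what is hard about research with NGS?
Workflow: Experimental design → Pilot experiment → Sampling → DNA preparation → Library construction → Sequencing → QC → Filtering → mapping/assemble → QC → Application-specific analysis → Validation experiment → Writing the paper → Data release → Paper submission → Revision / re-analysis → Acceptance → Celebration party
• There are points of no return
  • If the sequencing result is bad, data analysis cannot rescue it
  • If re-analysis takes a long time, you may miss the revision deadline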

Slide 53

Slide 53 text

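The general workflow of a study that uses NGS: what is hard about research with NGS?
Workflow: Experimental design → Pilot experiment → Sampling → DNA preparation → Library construction → Sequencing → QC → Filtering → mapping/assemble → QC → Application-specific analysis → Validation experiment → Writing the paper → Data release → Paper submission → Revision / re-analysis → Acceptance → Celebration party
• A design that includes pilot and validation experiments is extremely important
  • Wet-lab skills, such as preparing high-purity DNA, are also needed
  • Preparation is also needed, for example designing a validation experiment free of PCR bias in advance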

Slide 54

Slide 54 text

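The general workflow of a study that uses NGS: what is hard about research with NGS?
Workflow: Experimental design → Pilot experiment → Sampling → DNA preparation → Library construction → Sequencing → QC → Filtering → mapping/assemble → QC → Application-specific analysis → Validation experiment → Writing the paper → Data release → Paper submission → Revision / re-analysis → Acceptance → Celebration party
• Many journals require the data to be released before the paper is submitted
  • Releasing NGS data is harder than you might expect
  • Where should it be released in the first place?
  • There is a public database for NGS data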

Slide 55

Slide 55 text

About SRA, the public NGS database
Formerly Short Read Archive, currently Sequence Read Archive

Slide 56

Slide 56 text

Journal guidelines explicitly require data release
http://www.plosone.org/static/publication#data%20report

Slide 57

Slide 57 text

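Sequence Read Archive (SRA)
• A data repository for NGS
• Operated jointly by three organizations: NCBI, EBI, and DDBJ
• Raw sequence data obtained from the sequencer (fastq files with no preprocessing applied) are registered
• Everything commonly called "NGS data" is supposed to be registered in this database
  • Regardless of sequencer type (Illumina, Roche, LifeTech, etc.)
  • Regardless of application type (DNA-Seq, RNA-Seq, ChIP-Seq, etc.)
  • Regardless of species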

Slide 58

Slide 58 text

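About data formats
Workflow: Experimental design → Pilot experiment → Sampling → DNA preparation → Library construction → Sequencing → QC → Filtering → mapping/assemble → QC → Application-specific analysis → Validation experiment → Writing the paper → Data release → Paper submission → Revision / re-analysis → Acceptance → Celebration party
Formats along the workflow: fastq → sam/bam, fasta → gff/vcf/wig/etc.
• Data from each sequencer is generally converted to the fastq format
  • http://en.wikipedia.org/wiki/FASTQ_format
• Mapped data goes to .sam/.bam, assembled data to .fasta
• After application-specific analysis, data is converted to the respective formats and used for visualization and so on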
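To make the fastq format concrete, here is a minimal reader sketch. It assumes the simple, common layout of four lines per record (header, sequence, `+` separator, quality string) and Phred+33 quality encoding; multi-line records and other encodings are not handled. The function name `read_fastq` and the example record are hypothetical.

```python
import io
from typing import Iterator, List, Tuple


def read_fastq(handle) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read name, sequence, Phred scores) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file
            break
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored here
        quality = handle.readline().rstrip("\n")
        # ASSUMPTION: Phred+33 encoding, i.e. score = ASCII code - 33
        scores = [ord(ch) - 33 for ch in quality]
        yield header[1:], sequence, scores  # drop the leading '@'


example = io.StringIO("@read1\nACGT\n+\nII5!\n")  # hypothetical record
records = list(read_fastq(example))
```

Each base gets one quality character, so the sequence and the score list always have the same length; that per-base quality is what the later QC and filtering steps operate on.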

Slide 59

Slide 59 text

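DBCLS Large-Scale Data Technology Development Division: the NGS special forces
• Launched to get ahead of the growth in NGS data
  • Huge datasets of unprecedented size are being registered in public databases in large numbers
• Carries out research and technology development to solve problems specific to large-scale data
• Supports SRA activities on the technical side in cooperation with DDBJ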

Slide 60

Slide 60 text

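DBCLS SRA (http://sra.dbcls.jp)
• Collects and provides various statistics about the database
  • What kind of data is in there in the first place?
• Builds a search system for finding data
  • Integration with other databases (literature information such as PubMed and PMC, taxonomy, disease information, etc.)
• Surveys trends in sequencing technology based on individual sequence data, and more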

Slide 61

Slide 61 text

DBCLS SRA (http://sra.dbcls.jp)

Slide 62

Slide 62 text

Matching papers with public data
http://sra.dbcls.jp/cgi-bin/publication.cgi

Slide 63

Slide 63 text

It is not the case that only data used in papers gets registered
[Figure: three bar charts comparing the total number of entries with the number linked to a publication]

Unit | Total | With publication | Share
#submission | 115440 | 3059 | 2.65%
#sample | 194338 | 31787 | 16.4%
#run | 376904 | 51202 | 13.6%
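The shares can be recomputed directly from the counts on this slide. The sketch below assumes that the six numbers pair up in the order shown (total, then publication-linked, for submissions, samples, and runs); that pairing is inferred from the slide layout, and the function name is hypothetical.

```python
# (total entries, entries linked to a publication), as read from the slide
counts = {
    "submission": (115440, 3059),
    "sample": (194338, 31787),
    "run": (376904, 51202),
}


def percent_with_publication(total: int, with_publication: int) -> float:
    """Share of entries that are linked to at least one publication, in percent."""
    return 100.0 * with_publication / total


shares = {
    unit: round(percent_with_publication(total, linked), 2)
    for unit, (total, linked) in counts.items()
}
```

Whatever the exact pairing, the point of the slide holds: most of what is registered in SRA is not (yet) tied to a paper.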

Slide 64

Slide 64 text

Which sequencer should you use to get a paper?
[Figure: two bar charts of entries per platform, one for all entries (total) and one for entries linked to a publication]

Rank | Total: count, platform | Publication: count, platform
1 | 148946 Illumina HiSeq 2000 | 16481 Illumina Genome Analyzer II
2 | 65158 Illumina Genome Analyzer II | 10944 Illumina Genome Analyzer
3 | 33042 454 GS FLX Titanium | 5314 454 GS FLX Titanium
4 | 22010 Illumina Genome Analyzer | 5307 454 GS FLX
5 | 18290 Illumina Genome Analyzer IIx | 4659 Illumina Genome Analyzer IIx
6 | 16361 454 GS FLX | 3973 Illumina HiSeq 2000
7 | 5495 AB SOLiD System 2.0 | 1388 PacBio RS
8 | 4726 unspecified | 575 AB SOLiD System 3.0
9 | 4300 PacBio RS | 561 Illumina HiSeq 1000
10 | 3911 Illumina MiSeq | 340 Helicos HeliScope

Slide 65

Slide 65 text

Is "sequence a genome and you get a paper" true?
[Figure: two bar charts of entries per library source, one for all entries (total) and one for entries linked to a publication]

Library source | Total | Share of total | Publication | Share of publication
GENOMIC | 267185 | 80.6% | 33825 | 66.9%
TRANSCRIPTOMIC | 38804 | 11.7% | 12892 | 25.5%
METAGENOMIC | 16731 | 5.0% | 1913 | 3.8%
OTHER | 4412 | 1.3% | 1481 | 2.9%
SYNTHETIC | 2912 | 0.9% | 119 |
VIRAL RNA | 941 | | 295 |
METATRANSCRIPTOMIC | 290 | | 48 |

Slide 66

Slide 66 text

Registered data gets reused
[Figure: bar chart of the number of SRA IDs: 3059 linked to a publication in total, of which 204 are linked to more than one PMID]

ID | Count | Title
SRA008679 | 50 | HapMap project
SRA030426 | 48 | Human Prostate Cancer using Next Generation RNA Sequencing (human)
SRA024198 | 8 | Metagenomic analysis of marine microbes isolated during the Global Ocean Sampling Expedition
SRA008091 | 8 | Human 1000 genomes
SRA000271 | 7 | HapMap project

Slide 67

Slide 67 text

Searching efficiently for public data used in papers
DBCLS SRA Metadata Search: http://sra.dbcls.jp/search

Slide 68

Slide 68 text

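Searching efficiently for public data used in papers
DBCLS SRA Metadata Search: http://sra.dbcls.jp/search
• "Data quality cannot be rescued by analysis" applies here as well
  • Judging data quality requires rich metadata, such as the experimental conditions
• You have to find the data you need efficiently among a huge amount of data
  • Visualizes how much of the data you are looking for has been registered
• Large datasets take time to download and decompress
  • You do not want to draw a dud
  • Read information has been added (number of reads, read length, error rate, etc.)
  • Checking quality in advance lets you skip the QC step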

Slide 69

Slide 69 text

Trying a search: search for the query "virus" at http://sra.dbcls.jp/search

Slide 70

Slide 70 text

Trying a search: found an interesting-looking project with a paper attached

Slide 71

Slide 71 text

Trying a search: follow the link to the paper of the project you found

Slide 72

Slide 72 text

Trying a search: look for the description of sequencing in Materials & Methods

Slide 73

Slide 73 text

Trying a search: found the original paper

Slide 74

Slide 74 text

Trying a search: look for the description related to "sequence"

Slide 75

Slide 75 text

Trying a search: look for the description of the data processing

Slide 76

Slide 76 text

Trying a search: search Google for the name of the tool

Slide 77

Slide 77 text

Trying a search: found open-source software at the Broad Institute

Slide 78

Slide 78 text

Trying a search: go back to the project page and look at the data

Slide 79

Slide 79 text

Trying a search: click the Run ID to see the read information (FastQC results)

Slide 80

Slide 80 text

Trying a search: if there are no problems, click the download link, choose a format, and download

Slide 81

Slide 81 text

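Supplementary information
• About the Broad Institute
  • One of the world's largest sequencing centers, located in Boston
  • Other sequencing centers include the Sanger Institute (UK) and BGI (China)
• About FastQC
  • One of the most widely used QC tools for NGS data
• About the SRA/SRA Lite formats
  • Compressed NGS data
  • Can be expanded to .fastq or to the sequencer's own raw data format
  • Compression/decompression is done with the SRA Toolkit
  • http://www.ncbi.nlm.nih.gov/Traces/sra/?view=software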
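As a rough sketch of the decompression step, the following Python snippet builds a `fastq-dump` command line (the SRA Toolkit program that expands an SRA run to fastq) and runs it only when the tool is installed. The accession `SRR000001` and the output directory are placeholders, the helper names are hypothetical, and the exact options you need may differ between toolkit versions, so check the toolkit documentation before relying on this.

```python
import shutil
import subprocess
from typing import List


def fastq_dump_command(accession: str, outdir: str = "fastq") -> List[str]:
    """Build a fastq-dump call that writes gzipped fastq, one file per mate."""
    return ["fastq-dump", "--split-files", "--gzip", "--outdir", outdir, accession]


def run_if_available(command: List[str]) -> bool:
    """Run the command only if the executable is on PATH; report whether it ran."""
    if shutil.which(command[0]) is None:
        return False  # SRA Toolkit not installed
    subprocess.run(command, check=True)
    return True


# Placeholder accession, used only to show the shape of the command.
command = fastq_dump_command("SRR000001")
```

Calling `run_if_available(command)` would start the actual conversion, so it is left out of the sketch; keeping command construction separate from execution also makes it easy to log exactly what was run, which is useful metadata when the analysis is written up.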

Slide 82

Slide 82 text

Following the workflow of a real study using public data
Standing on the shoulders of giants

Slide 83

Slide 83 text

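The general workflow of a study that uses NGS: when you do the sequencing yourself
Workflow: Experimental design → Pilot experiment → Sampling → DNA preparation → Library construction → Sequencing → QC → Filtering → mapping/assemble → QC → Application-specific analysis → Validation experiment → Writing the paper → Data release → Paper submission → Revision / re-analysis → Acceptance → Celebration party
• The celebration party is still a long way off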

Slide 84

Slide 84 text

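The general workflow when using public data
Workflow: Define the search target → Search → Search for papers and other related information → Collect the necessary metadata → Download → QC → Filtering → mapping/assemble → QC → Application-specific analysis → Validation experiment → Writing the paper → Data release → Paper submission → Revision / re-analysis → Acceptance → Celebration party
• Purposes of using public data
  • As supplementary information for your own experiments
  • As a survey for a newly launched study, carried through to data analysis
  • To build data analysis tools/pipelines
• The celebration party gets a little closer (maybe)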

Slide 85

Slide 85 text

Looking for public data
• Search from the website of each SRA broker
  • NCBI: http://www.ncbi.nlm.nih.gov/sra
  • EBI: http://www.ebi.ac.uk/ena/
  • DDBJ: http://trace.ddbj.nig.ac.jp/DRASearch/
• Search from DBCLS SRA
  • http://sra.dbcls.jp/
  • http://sra.dbcls.jp/search
• Search from GEO or ArrayExpress
  • GEO: http://www.ncbi.nlm.nih.gov/geo/
  • ArrayExpress: http://www.ebi.ac.uk/arrayexpress/
• Search from PubMed
  • Via links such as SRA/GEO in the sidebar

Slide 86

Slide 86 text

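Collecting the necessary metadata
• How were sampling and preprocessing done?
• How was the library prepared?
  • If possible, the reagents and their versions too
• Which sequencer was used?
  • If possible, the version too
• How many reads?
• single? paired?
  • If paired, what is the insert length?
• What is the read length?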

Slide 87

Slide 87 text

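Collecting the necessary metadata (continued)
• Tags?
  • multiplex?
  • Barcoding?
• How are multiple Runs handled?
  • replicates?
  • Was one library split and sequenced separately?
• What data processing was done?
  • mapping? assemble?
  • Which software/tools were used?
  • Which version of the reference genome was used?
• What data analysis was done?
  • Which software, tools, or pipelines were used?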

Slide 88

Slide 88 text

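Downloading the data
• The data is the same whether you download it from NCBI, EBI, or DDBJ
  • Choose according to the speed of your Internet connection
• File sizes range from a few hundred megabytes to several terabytes
  • Check the file size beforehand
• Downloading can be made more convenient in various ways
  • Use download software for your PC
  • Use Aspera Connect, provided by NCBI and DDBJ
  • Use commands such as lftp on Linux/Unix (including the Mac Terminal)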
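Checking the file size beforehand is mostly about knowing how long the transfer will take. The following back-of-the-envelope sketch estimates that, assuming decimal units (1 GB = 1000 MB), a sustained link speed, and no protocol overhead; the function name and the 100 Mbit/s figure are hypothetical.

```python
def download_hours(size_gb: float, bandwidth_mbps: float) -> float:
    """Estimated transfer time in hours for a file of `size_gb` gigabytes
    over a link sustaining `bandwidth_mbps` megabits per second."""
    size_megabits = size_gb * 1000 * 8  # GB -> MB -> megabits
    seconds = size_megabits / bandwidth_mbps
    return seconds / 3600


# The two ends of the size range mentioned above, on a 100 Mbit/s line (assumed)
hours_small = download_hours(0.5, 100)    # a 500 MB file
hours_large = download_hours(2000, 100)   # a 2 TB file
```

Under these assumptions the small file takes well under a minute while the large one takes almost two days, which is why the choice of mirror and transfer tool matters for the larger submissions.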

Slide 89

Slide 89 text

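Data processing and data analysis
• Data processing (mapping/assemble)
  • Requires enormous computing resources
  • De novo assembly of a multi-gigabase genome can use as much as 20 TB of memory
  • Data processing can take days to weeks
• Data analysis
  • Command-line open-source tools (Linux, Mac)
  • PC software
  • Online applications that run in the browser
• Services and software that can do both data processing and analysis
  • DDBJ Read Annotation Pipeline (http://p.ddbj.nig.ac.jp)
  • CLC Bio Genomics Workbench (http://www.clcbio.co.jp)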

Slide 90

Slide 90 text

DDBJ Read Annotation Pipeline (p.ddbj.nig.ac.jp)

Slide 91

Slide 91 text

DDBJ Read Annotation Pipeline (p.ddbj.nig.ac.jp)

Slide 92

Slide 92 text

DDBJ Read Annotation Pipeline (p.ddbj.nig.ac.jp)

Slide 93

Slide 93 text

If you are familiar with "microarrays in R", look here: Prof. Kadota, University of Tokyo
Search for 「Rで塩基配列解析」 (sequence analysis with R)

Slide 94

Slide 94 text

If you cannot get enough of R, look here: Dr. Nikaido, RIKEN
cat.hackingisbelieving.org/lecture/

Slide 95

Slide 95 text

When you are stuck
• Search the Internet
• If you still cannot find it after trying hard for 90 minutes, ask someone

Slide 96

Slide 96 text

When you are stuck
• Search the Internet
  • Google
  • SeqAnswers
    • seqanswers.com
  • BioStars
    • www.biostars.org
  • NGS Surfer’s Wiki
    • cell-innovation.nig.ac.jp/wiki/
  • Sequence Read Archive User Reference
    • github.com/inutano/sra_metadata_toolkit/wiki

Slide 97

Slide 97 text

When you are stuck, search here first: SEQanswers
http://seqanswers.com

Slide 98

Slide 98 text

Video tutorial: how to use CLC Genomic Workbench
togotv.dbcls.jp

Slide 99

Slide 99 text

When you are stuck
• If you still cannot find it after trying hard for 90 minutes, ask someone
  • NGS現場の会 (NGS Field) mailing list
  • BioStars
  • ライフサイエンスQA (Life Science QA)
    • http://qa.lifesciencedb.jp
  • twitter

Slide 100

Slide 100 text

For Q&A in Japanese, go here: ライフサイエンスQA (Life Science QA)
http://qa.lifesciencedb.jp

Slide 101

Slide 101 text

Summary
Thank you for your hard work

Slide 102

Slide 102 text

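“Sequencing is FREE, so let's sequence everything”
• Research with NGS is not straightforward
• But it lets you observe phenomena that could not be captured before
• The design matters: should you really use NGS, and what will you look at with it?
• What you do not understand can be resolved by searching the Internet or asking someone
• Solve problems before you charge past a point of no return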