Sequence Read Archive: Database for High-throughput sequencing best practice 2013

Integrated Database Workshop AJACS Toyama: "Using the next-generation sequencing database Sequence Read Archive"

Tazro Inutano Ohta

August 30, 2013

Transcript

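  1. Using the next-generation sequencing database Sequence Read Archive. AJACS#42 TOYAMA, Integrated Database Workshop AJACS Toyama

  2. Tazro Ohta, Tech Specialist, Database Center for Life Science. Effective SRA - public database for high-throughput sequencing

  3. Preface

  4. Preface. This is not a tutorial on NGS data analysis, but resources that support data analysis are introduced. This is not a tutorial on submitting data to the databases, but the information needed when submitting is introduced. This is a talk about the public data that supports NGS research: how to use the public databases so that they pay off in day-to-day research.

  5. Today's contents: Table of Contents

  6. Table of Contents. About the Sequence Read Archive (SRA): organization, policy, and published data. Work at DBCLS: integration with other databases, development of search functions, and visualizing the current state of the database through statistics. Examples of searching and using NGS data: using the services introduced here to survey past NGS research.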
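  7. About the public NGS database Sequence Read Archive (SRA): organization, policy, and published data. SRA: The public DB for primary NGS data

  8. About SRA. Organization: jointly operated by the INSDC, which consists of NCBI SRA, EBI ENA, and DDBJ DRA; data can be submitted and accessed through any of the member archives. Policy: accepts and publishes primary sequence data obtained from massively parallel sequencers; personally identifiable data is submitted to separate databases (dbGaP, EGA). Published data: sequence data plus metadata describing details such as the experiment and the samples; sequences can be downloaded compressed as gz or bz2, or in the archive's own format.

  9. INSDC: International Nucleotide Sequence Database Collaboration, http://www.insdc.org. Callouts: information such as the standardization of genomic information; the policy of the database collaboration.

  10. The INSDC Members. NCBI SRA: receives the largest volume of submitted data; developer of the SRA format, its own compressed format. EMBL-EBI ENA (SRA): developed and operated by the same section that runs the conventional sequence archive; both submission and search are designed for GUI and CUI use. DDBJ DRA: operated by DDBJ at the National Institute of Genetics in Mishima; also publishes data in uncompressed form.

  11. The INSDC Members (repeat of slide 10)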
  12. NCBI SRA, http://www.ncbi.nlm.nih.gov/sra. Callouts: free-text search; advanced search; SRA BLAST (long reads only); software: SRA Toolkit.

  13. NCBI SRA: search results for the keyword "human microbiome project". Callouts: BioProject hits; number of hits; results listed per Experiment; title, sequencer, amount of sequence, and similar information; organisms among the hits.

  14. NCBI SRA: the top hit of the search results (an SRX entry). Callouts: title of the experiment; read information such as layout and adaptor; information and download links for each sequencing run.

  15. The INSDC Members (repeat of slide 10)
  16. EMBL-EBI ENA, http://www.ebi.ac.uk/ena. Callouts: advanced search; free-text search; sequence search.

  17. EMBL-EBI ENA: search results for the keyword "human microbiome project". Callouts: number of hits in each category; top hits of each category.

  18. EMBL-EBI ENA: the top hit under "Experiment" (an SRX entry). Callouts: information about the experiment; information and download links for the sequencing runs; bulk download, display as text, and column selection; download the displayed information as text.

  19. The INSDC Members (repeat of slide 10)
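  20. DDBJ DRA, http://trace.ddbj.ac.jp/dra. Callouts: site search; data search, data submission, and video manuals.

  21. DDBJ DRA, http://trace.ddbj.ac.jp/DRASearch. Callouts: search by ID, faceted (narrow-down) search, and keyword search; top rankings by organism, experiment type, and submitting center; number of entries for each ID type.

  22. DDBJ DRA: search results, http://trace.ddbj.ac.jp/DRASearch. Callouts: total number of hits; narrowing down by metadata type and organism.

  23. DDBJ DRA: narrow down to "Experiment", then open the top hit (an SRX entry). Callouts: links to related items and download links; detailed information on the experiment; information such as library construction, sequencer, and base calling.

  24. DDBJ DRA: Navigation → Run (an SRR entry). An example of data that cannot be downloaded.

  25. DDBJ DRA: DRASearch → DRR. An example of data that can be downloaded. Callouts: read information; checking "quality" displays the Phred scores; download links (FTP) for the Fastq format and the SRA Lite format.

  26. DDBJ DRA: searching for the same "SRR" entry at NCBI SRA → "Record is removed".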

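  27. Hands-on. Try searching: search by organism, sequencer name, gene name, disease name, and so on; try the Advanced search as well; if there are too many or too few hits, try a different search. Look into the details of the data you find: look for information that lets you judge whether the data is interesting. Check how large the data is: how long is the download likely to take, and will it fit in the free space on your hard disk?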
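
The two hands-on questions (how long will the download take, and will the data fit on the disk) can be estimated before downloading anything. The following is a minimal sketch, not part of the original slides: the 20 GB size, the 100 Mbit/s bandwidth, and the `expansion_factor` parameter are illustrative assumptions.

```python
import shutil


def estimate_download_seconds(size_bytes: int, bandwidth_bits_per_sec: float) -> float:
    """Seconds needed to transfer size_bytes at a sustained bandwidth."""
    if bandwidth_bits_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes * 8 / bandwidth_bits_per_sec


def fits_on_disk(size_bytes: int, path: str = ".", expansion_factor: float = 1.0) -> bool:
    """True if the data, optionally inflated by expansion_factor to allow
    for decompression, fits in the free space of the disk holding path."""
    free_bytes = shutil.disk_usage(path).free
    return size_bytes * expansion_factor <= free_bytes


# Hypothetical example: a 20 GB dataset over a 100 Mbit/s connection.
size = 20 * 10**9
minutes = estimate_download_seconds(size, 100e6) / 60
print(f"about {minutes:.0f} minutes; fits on disk: {fits_on_disk(size)}")
```

Real transfers rarely sustain the nominal bandwidth, so the result is best read as a lower bound on the download time.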
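  28. Search Tips. Each archive develops its own features: the kinds of search you can run and the way results are displayed differ, but the IDs are shared, so switching between the sites makes searching more convenient. Some data cannot be downloaded: besides data withdrawn by the submitter for various reasons, data that has only just been submitted may not yet be shared between the archives and so cannot be found. Information that is not written in the metadata cannot be searched: metadata is the annotation attached to the sequence data, written by the submitter in a prescribed format.

  29. Metadata: Metadata Object

  30. Metadata Object Dependencies. Diagram: Submission; Analysis; Study; Sample, Sample; Experiment, Experiment; Run, Run, Run, Run. The metadata submitted together with the sequence data consists of the object types shown here, and the information written depends on the type of object.

  31. Metadata Object Dependencies. Diagram: Submission; Analysis; Study; Sample, Sample; Experiment, Experiment; Run, Run, Run, Run. Leaving aside Submission, which bundles the metadata into a unit of submission, the basic relationships within a set of metadata look like this.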
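  32. Metadata Object Dependencies. Diagram: DRA000001; DRZ000001; DRP000001; DRS000001, DRS000001; DRX000001, DRX000002; DRR000004, DRR000003, DRR000002, DRR000001. Each object has its own ID. An ID consists of three letters, which indicate the database that accepted the data and the type of object, followed by a six-digit number.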
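
The ID scheme on this slide can be sketched as a small parser. This is an illustration under stated assumptions, not an official validator: the function name `parse_accession` is hypothetical, the letter tables reflect the commonly used INSDC prefixes (S, E, D for NCBI, EBI, DDBJ; A, P, S, X, R, Z for the object types), and the pattern accepts six or more digits because newer accessions can be longer than the six digits shown in 2013.

```python
import re
from typing import NamedTuple

# First letter: the archive that accepted the submission.
ARCHIVES = {"S": "NCBI SRA", "E": "EBI ENA", "D": "DDBJ DRA"}

# Third letter: the metadata object type.
OBJECT_TYPES = {
    "A": "Submission",
    "P": "Study",
    "S": "Sample",
    "X": "Experiment",
    "R": "Run",
    "Z": "Analysis",
}

ACCESSION_RE = re.compile(r"^([SED])R([APSXRZ])(\d{6,})$")


class Accession(NamedTuple):
    archive: str
    object_type: str
    number: int


def parse_accession(accession: str) -> Accession:
    """Split an SRA-style ID such as 'DRR000001' into its parts."""
    match = ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not an SRA-style accession: {accession!r}")
    archive_letter, type_letter, digits = match.groups()
    return Accession(ARCHIVES[archive_letter], OBJECT_TYPES[type_letter], int(digits))


print(parse_accession("DRR000001"))
print(parse_accession("SRP011204"))
```

Because the prefix encodes where the data was first accepted, the same ID can be looked up at any of the three archives.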
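  33. Metadata Tips. The relationships between objects and IDs are complicated: in large projects the Runs and Samples can number in the hundreds, and for various reasons some entries are (exceptionally) outside the rules. Know "which information is written where": this matters not only when submitting data but also when searching; for details see http://trace.ddbj.ac.jp/dra/metadata.html. Metadata descriptions vary between submitters: especially in the library preparation fields; the metadata is also sometimes not updated when information such as the paper is updated.

  34. Summary #1

  35. Summary #1. SRA is operated by the INSDC member archives: the data is shared, so it is the same whichever entry point you use, but search functions and the like differ between them. Metadata is important for submitting and searching sequence data: each object is assigned an ID and is written by the submitter; the contents and relationships need to be understood, especially when submitting data.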

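  36. Work at DBCLS: integration with other databases, development of search functions, and visualizing the current state of the database through statistics. Tech Dev at DBCLS - Search and Statistics

  37. DBCLS SRA. Integration with other databases: not only metadata but also literature information such as papers, disease information, and the computed sequence quality of individual datasets. Development of search functions: a search aimed at data users that integrates the features of the member archives and adds original ones. Grasping the state of the database through statistics: trends in the number of submissions, based on the metadata, are published, and information on the database as a whole, based on the sequence data, is analyzed.

  38. DBCLS SRA

  39. DBCLS SRA, http://sra.dbcls.jp/. Callouts: list the submitted data by metadata type; search by SRA ID, organism, sequencer, and so on.

  40. DBCLS SRA, http://sra.dbcls.jp/. Callouts: rankings by experiment type, sequencer, and organism; graph of the trend over time.

  41. DBCLS SRA, http://sra.dbcls.jp/. Callouts: find data from literature information; find data from disease information.

  42. Integrating literature information: DBCLS SRA Publication Search

  43. DBCLS SRA Publication Search. The paper describes the sequence data in more detail than the metadata: where the sequencing sits within the study is also important, and detailed information is often found in Materials & Methods. Sequence data is sometimes released before the paper: because of grant conditions, data release requirements from journals, and so on; large projects sometimes set their own release policy. Literature information is sometimes never added to the metadata: metadata tends not to be updated once it has been submitted, so released data has to be linked to papers.

  44. DBCLS SRA Publication Search: DBCLS SRA → "Search from literature". Callouts: narrow-down search by experiment type, sequencer, and organism; a table mapping SRA IDs to PubMed IDs, with the literature information; click a column name to sort.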
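  45. Integrating disease information: DBCLS SRA Diseases Search

  46. DBCLS SRA Diseases Search. Searching for clinical sequencing data is difficult: whole-genome sequences, polymorphism information, and the like are often not released in SRA. The metadata description alone is sometimes insufficient: the style of description and the amount of information vary between submitters, which makes it hard to search everything at once. Use the tags attached to the literature information: a data search keyed on disease information was developed using the MeSH terms assigned to PubMed entries.

  47. DBCLS SRA Diseases Search: DBCLS SRA → "Browse by disease" → by frequency. Callouts: search by disease type; disease name and number of submitted datasets; set the number of items displayed.

  48. DBCLS SRA Diseases Search: DBCLS SRA → "Browse by disease" → by disease category. Callouts: click to expand the tree; click a number to display the list.

  49. DBCLS's own search function: DBCLS SRA Metadata Search

  50. DBCLS SRA Metadata Search. Provide a more user-oriented search: reduce as far as possible the knowledge users have to memorize, such as the management of data by metadata IDs. Reflect more information in the search: a flexible search that takes in not only the metadata but also information from the other integrated databases and original information. Support automation: repeating searches by hand is inefficient, and automation makes it possible to build the search into analysis pipelines.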
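  51. DBCLS SRA Metadata Search, http://sra.dbcls.jp/search. Callouts: information for developers and the support Twitter account; free-text search; narrow-down search.

  52. DBCLS SRA Metadata Search: narrow-down search (Mus musculus, Transcriptome, Illumina MiSeq). Callouts: free-text search within the data matching the conditions; display all data matching the conditions; the proportion of data matching each condition.

  53. DBCLS SRA Metadata Search: search results. Callouts: click a column name to sort; narrow down items by keyword; information on the matching data, where blue rows have paper information attached.

  54. DBCLS SRA Metadata Search: the result of clicking an SRP ID. Callouts: overview of the project; overview and abstract of the paper; links to PubMed and PMC.

  55. DBCLS SRA Metadata Search: the result of clicking an SRP ID. Callouts: click to expand; Materials and Methods, Results.

  56. DBCLS SRA Metadata Search: the result of clicking an SRP ID. Callouts: display the table in tsv or json format; sort and narrow down; download links; Run and Sample information.

  57. DBCLS SRA Metadata Search: the result of clicking an SRP ID. Callouts: display the table in tsv or json format; sort and narrow down; differences across the whole set are highlighted.

  58. DBCLS SRA Metadata Search: quality information for a Run (an SRR entry). Callouts: information such as number of reads, read length, and GC content; click the result of each module to enlarge it.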
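
To make the per-run numbers on this slide concrete (number of reads, read length, GC content), here is a minimal sketch that computes them from a list of read sequences. It only illustrates what the figures mean and is not the method DBCLS SRA uses; the function name `read_stats` and the example reads are hypothetical.

```python
from typing import Dict, Iterable


def read_stats(sequences: Iterable[str]) -> Dict[str, float]:
    """Number of reads, mean read length, and GC content in percent."""
    reads = 0
    total_length = 0
    gc_count = 0
    for sequence in sequences:
        bases = sequence.upper()
        reads += 1
        total_length += len(bases)
        gc_count += bases.count("G") + bases.count("C")
    if total_length == 0:
        return {"reads": reads, "mean_length": 0.0, "gc_percent": 0.0}
    return {
        "reads": reads,
        "mean_length": total_length / reads,
        "gc_percent": 100.0 * gc_count / total_length,
    }


print(read_stats(["GGCC", "ATAT"]))
```

Checking these values before downloading helps decide whether a run matches the read spec a planned analysis needs.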

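  59. DBCLS SRA Metadata Search. Visualize the search: you can judge why there are many or few results by looking at the proportions within the whole. Keyword search that includes paper information: the search target is widened so that as much related data as possible is found. Check the read information before downloading: downloads can take a long time, so information is provided for choosing only the data that will certainly be usable.

  60. Summary #2

  61. Summary #2. DBCLS SRA is an extension of SRA: it does not accept data submissions; it provides information for grasping the state of SRA and search functions that make data easier to find. Metadata is important for submitting and searching sequence data: each object is assigned an ID and is written by the submitter; the contents and relationships need to be understood, especially when submitting data.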

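  62. Examples of searching and using NGS data: using the services introduced here to survey past NGS research. Search published NGS data and project

  63. Use cases of public data, by stage

  64. Use cases of public data. Sequencing you are about to do: design and simulate the sequencing based on information about similar projects and their data. Sequencing in progress: evaluate the sequencing quality against data from the same organism and sequencer. Sequencing already completed: add data from closely related species or similar projects to improve the accuracy of the analysis.

  65. What you need for searching and using data: Practical search tips

  66. Practical search tips. Know the read spec that the experiment type requires: the required read length and number of reads differ by experiment type, such as genome resequencing, RNA-Seq, and ChIP-Seq. Know the read spec of the sequencer: the spec of the reads obtained from each sequencer also changes with reagent updates and the like, so take care. Choose the sequencer according to the organism and the experiment type: read information matched to the genome size and experiment type is also important for finding similar projects in the public databases.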
  67. Required read spec by application

     application                  total bases   read length   read number (M)
     Human genome resequencing    90-150 Gb     2x100         900-1500
     Targeted resequencing        <1 Gb         2x100         10
     exome sequence               5~7 Gb        2x100         70
     RNA-Seq                      5 Gb          2x100         50
     TSS-Seq                      1 Gb          1x50          20
     small RNA                    0.35 Gb       1x35          >10
     Microbial genome             >150 Mb       2x100         >1.5
     Eukaryotic genome            >4 Gb         2x100         >40
     Bisulfite-Seq                90-150 Gb     2x100         900-1500
     ChIP-Seq                     >6 Gb         1x100         60

     Quoted from the Saibo Kogaku (Cell Technology) supplement "Next-Generation Sequencers: Advanced Methods by Purpose". Note: the numbers can change with the size of the target genome and other factors, and the information may already be out of date.
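
The three numeric columns of this table are tied together by simple arithmetic: the rows are consistent with total bases = read length × number of reads, where each read of a pair is counted separately (for RNA-Seq, 100 b × 50 M reads = 5 Gb). The sketch below works through that relationship; the function names are hypothetical and the formula is inferred from the table rather than stated on the slide.

```python
import math


def total_bases_gb(read_length: int, reads_millions: float) -> float:
    """Total sequenced bases in Gb for reads_millions reads of read_length bases."""
    return read_length * reads_millions * 1e6 / 1e9


def reads_needed_millions(target_gb: float, read_length: int) -> float:
    """Number of reads (in millions) needed to reach target_gb total bases."""
    return target_gb * 1e9 / read_length / 1e6


# Rows from the table: (application, read length in bases, reads in millions)
rows = [("RNA-Seq", 100, 50), ("TSS-Seq", 50, 20), ("small RNA", 35, 10)]
for name, length, reads in rows:
    print(f"{name}: {total_bases_gb(length, reads):.2f} Gb")

# Sanity check against the exome row: 7 Gb of 100 b reads is 70 M reads.
assert math.isclose(reads_needed_millions(7, 100), 70)
```

The same arithmetic can be applied to the read count and read length reported for a public run to check whether it meets the spec of a planned experiment.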
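  68. Required read spec by application. Contents of the book: preparation and introduction; human genome analysis; gene expression regulation analysis; de novo genome sequencing; epigenetics analysis; metagenome analysis; genome structure analysis; data analysis tools and storage; integrated analysis. (This is a plug, but nobody asked me to make it and I earn nothing if the book sells.)

  69. Read spec, still improving. As reagents and software improve, the number and length of reads change frequently even on the same sequencer. Example: the illumina MiSeq.

  70. Example: finding research examples of mouse gene expression. Example survey: mouse brain transcriptome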

  71. Example survey: mouse brain transcriptome. http://sra.dbcls.jp/search. Specify the organism and the experiment type, leave the sequencer blank, and press "submit condition".

  72. Example survey: mouse brain transcriptome. The matching projects are listed. http://sra.dbcls.jp/search/filter?species=Mus%20musculus&type=Transcriptome&instrument= Enter "brain" as the keyword and press "search".

  73. Example survey: mouse brain transcriptome. The matching projects are listed. http://sra.dbcls.jp/search/search?species=Mus%20musculus&type=Transcriptome&instrument=&search_query=brain Enter "brain" in the input field under Study Title to narrow down the list.
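
The URLs on slides 72 and 73 show that the search conditions are plain query parameters (`species`, `type`, `instrument`, `search_query`), which is what makes the search scriptable. The following sketch rebuilds those two URLs; it only constructs strings and makes no request. The function names are hypothetical, and it is an assumption that the service still answers at these addresses or keeps the same parameters.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://sra.dbcls.jp/search"


def filter_url(species: str, study_type: str, instrument: str = "") -> str:
    """URL of the narrow-down step (organism, experiment type, sequencer)."""
    params = {"species": species, "type": study_type, "instrument": instrument}
    # quote_via=quote encodes spaces as %20, as in the URLs on the slides.
    return f"{BASE_URL}/filter?{urlencode(params, quote_via=quote)}"


def search_url(species: str, study_type: str, keyword: str, instrument: str = "") -> str:
    """URL of the keyword search within the narrowed-down data."""
    params = {
        "species": species,
        "type": study_type,
        "instrument": instrument,
        "search_query": keyword,
    }
    return f"{BASE_URL}/search?{urlencode(params, quote_via=quote)}"


print(filter_url("Mus musculus", "Transcriptome"))
print(search_url("Mus musculus", "Transcriptome", "brain"))
```

Leaving `instrument` empty corresponds to leaving the sequencer field blank in the web form.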

  74. Example survey: mouse brain transcriptome. http://sra.dbcls.jp/search/search?species=Mus%20musculus&type=Transcriptome&instrument=&search_query=brain Click Study ID to sort the projects from newest to oldest. Let's look at the data of this project.

  75. Example survey: mouse brain transcriptome. http://sra.dbcls.jp/search/view/SRP011204 Callouts: overview of the project; overview of the sequencing done in the project.

  76. Example survey: mouse brain transcriptome. http://sra.dbcls.jp/search/view/SRP011204 Callouts: approximate number of reads (M), read length (b), and total bases (Gb); the number of Samples and Runs: were the samples split, or are these replicates?

  77. Example survey: mouse brain transcriptome. http://www.ncbi.nlm.nih.gov/geo/ Search with the GEO ID (a "GSE" accession) that appeared in the title.

  78. Example survey: mouse brain transcriptome. http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE36232 They were replicates.

  79. Example survey: mouse brain transcriptome. http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE36232 Callouts: in GEO the paper information has been updated; detailed information on each sample; click a GEO Sample ID to see the information on the Control.

  80. Example survey: mouse brain transcriptome. http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM884353 Callouts: Sample Characteristics; the protocol used to process the Sample.

  81. Example survey: mouse brain transcriptome. http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM884353 Callouts: the SRA Experiment ID; click the Biosample ID to see how the Samples are related.

  82. Example survey: mouse brain transcriptome. http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM884353 Callouts: the corresponding SRA Sample ID; download link for the sequence data in SRA format.

  83. Example survey: mouse brain transcriptome. http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE36232 Go back to the GEO page and click the link to the paper.

  84. Example survey: mouse brain transcriptome. http://www.ncbi.nlm.nih.gov/pubmed/22563483 While we are here, check the full text in PubReader.

  85. Example survey: mouse brain transcriptome. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3341364/?report=reader Click Navigation, then click Materials & Methods.

  86. Example survey: mouse brain transcriptome. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3341364/?report=reader Callouts: about the data analysis, including the tools used; about the library preparation and the sequencing.

  87. Practical search tips. Get the read information for data that matches your conditions: read length, number of reads, sample information, and so on; does it suit your purpose, and is there enough data? Get information on library preparation and data analysis: this is not often recorded in SRA, but it can sometimes be obtained by following information in the paper or in external databases such as GEO. Download the data that matches your conditions: for some data, downloading and unpacking the files takes a very long time; download fastq from the DDBJ FTP site, or use the DDBJ pipeline.
  88. Checking and downloading the data: Quality check and download

  89. Read quality check. http://sra.dbcls.jp/search/view/SRR426841 Check the quality at each position in the reads; also check GC content and similar values.

  90. Data download via FTP. http://sra.dbcls.jp/search/view/SRP011204 Click "FTP"; choosing the database and the format opens the FTP site.

  91. Data download via FTP. http://trace.ddbj.nig.ac.jp/DRASearch/run?acc=SRR426841 Click either format, FASTQ or SRA Lite.

  92. Data download via FTP. Log in to the FTP site as a guest; you can access the fastq files compressed in bz2 format.
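
Once a bz2-compressed fastq file has been downloaded, it can be read without unpacking it to disk first. The following is a minimal sketch that assumes plain four-line FASTQ records and Phred+33 quality encoding; neither assumption is stated on the slides, and the file name in the usage note is hypothetical.

```python
import bz2
from typing import Iterator, List, Tuple


def read_fastq_bz2(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a bz2-compressed FASTQ file.

    Assumes four lines per record: '@id', sequence, '+', quality.
    """
    with bz2.open(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # the '+' separator line
            quality = handle.readline().rstrip("\n")
            yield header[1:], sequence, quality


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert a FASTQ quality string to Phred scores (Phred+33 by default)."""
    return [ord(char) - offset for char in quality]
```

A typical use would be `for read_id, seq, qual in read_fastq_bz2("SRR426841.fastq.bz2"): ...`, which streams the records and avoids the disk space that a decompressed copy would take.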

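  93. Using the pipeline: DDBJ Read Annotation Pipeline

  94. DDBJ Read Annotation Pipeline. https://p.ddbj.nig.ac.jp/ → log in. After logging in, click "Import public DRA"; enter an SRA ID to add the data to the pipeline.

  95. Summary #3

  96. Summary #3. Use the literature and the read information to get what you need: you need the read information and information on library preparation, analysis, and so on; when the information simply cannot be found, knowing when to give up is also important. Make good use of public analysis pipelines: huge datasets take a long time to download and eat into hard disk space; using the DDBJ pipeline lowers these costs.

  97. Finding the information you need online: Online Reference

  98. Online Reference. http://github.com/inutano/sra_metadata_toolkit/wiki A collection of references and links on SRA and NGS.

  99. Q&A. Thank you for your attention. Questions can also be sent to tohta@dbcls.rois.ac.jp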