Public data repository and analysis pipeline for high-throughput sequencing

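Specified nonprofit corporation 酵母細胞研究会 (Yeast Cell Research Society), 186th regular meeting: research examples using next-generation sequencers, and the public tools and databases that support them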

Tazro Inutano Ohta

July 11, 2014

Transcript

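  1. Research examples using next-generation sequencers, and the public tools and databases that support them: Public data repository and analysis pipeline for high-throughput sequencing. Tazro Ohta <t.ohta@dbcls.rois.ac.jp>, Database Center for Life Science (DBCLS), Research Organization of Information and Systems (ROIS). Prepared for the 186th regular meeting of 酵母細胞研究会, July 11, 2014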
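  2. Agenda ‣ About DBCLS and the Database Integration Project ‣ Databases related to NGS ‣ The role of public databases in an NGS-based research workflow ‣ From searching public data to analyzing it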

  3. About DBCLS and the Database Integration Project

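  4. About DBCLS and the Database Integration Project: the Database Center for Life Science (DBCLS) is a research institute that develops technologies to promote data sharing and support database construction in the life sciences. ‣ Part of the Research Organization of Information and Systems (ROIS), an Inter-University Research Institute Corporation ‣ Works with NBDC (under JST) and with DDBJ at the National Institute of Genetics (also under ROIS) ‣ NBDC is responsible for funding, DDBJ for data archiving, and DBCLS for technology development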
  5. http://dbcls.rois.ac.jp/about

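  6. DBCLS and the Integration Project: services developed and operated so far ‣ Integbio Database Catalog ‣ Life Science Database Archive ‣ Other services that support database integration, such as cross-database search ‣ Services that apply individually developed research technologies ‣ togogenome, GGRNA, RefEx, 新着論文レビュー (reviews of newly published papers), TogoTV, InMeXes, etc.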
  7. Database of Databases: Integbio DBcatalog http://integbio.jp/dbcatalog

  8. Entries can be filtered by species or category http://integbio.jp/dbcatalog

  9. We take over the maintenance of databases http://dbarchive.biosciencedbc.jp/

  10. Supports bulk download and management of terms of use http://dbarchive.biosciencedbc.jp/

  11. Find more at http://biosciencedbc.jp

  12. togogenome.org ggrna.dbcls.jp refex.dbcls.jp first.lifesciencedb.jp togotv.dbcls.jp docman.dbcls.jp/im

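  13. togogenome.org (genome information and visualization) ‣ ggrna.dbcls.jp (fast nucleotide sequence search) ‣ refex.dbcls.jp (gene expression reference) ‣ first.lifesciencedb.jp (paper reviews in Japanese) ‣ togotv.dbcls.jp (video tutorials) ‣ docman.dbcls.jp/im (support for writing papers)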
  14. Find more at http://dbcls.rois.ac.jp/services

  15. Databases related to NGS: Data Repositories and Databases for high-throughput sequencing

  16. Databases and public data repositories related to NGS ‣ The international nucleotide sequence databases and the Sequence Read Archive ‣ Data hosting by large-scale projects

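  17. The international nucleotide sequence databases and the Sequence Read Archive ‣ INSDC: Int'l Nucleotide Sequence Database Collaboration ‣ Operated jointly by teams at three sites: NCBI, EBI, and DDBJ ‣ Sequence Read Archive: the primary data repository for NGS www.insdc.org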
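
Because the Sequence Read Archive is run jointly by NCBI, EBI, and DDBJ, its accessions carry a prefix that shows which site issued them and what kind of object they name. The slides do not cover this; the following is a minimal sketch of that naming convention, and the function name `describe_accession` is hypothetical.

```python
import re

# First letter of an SRA accession: the INSDC member that issued it.
ARCHIVES = {"S": "NCBI", "E": "EBI", "D": "DDBJ"}

# Third letter: the kind of SRA object the accession refers to.
OBJECT_TYPES = {
    "A": "submission",
    "P": "study",
    "X": "experiment",
    "S": "sample",
    "R": "run",
    "Z": "analysis",
}

# e.g. DRR000001, ERP000546, SRX123456 (six or more digits)
ACCESSION_RE = re.compile(r"^([SED])R([APXSRZ])(\d{6,})$")


def describe_accession(accession: str) -> dict:
    """Split an SRA accession into issuing archive, object type and serial number."""
    match = ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not an SRA-style accession: {accession!r}")
    archive, kind, number = match.groups()
    return {
        "archive": ARCHIVES[archive],
        "type": OBJECT_TYPES[kind],
        "number": int(number),
    }


if __name__ == "__main__":
    for acc in ("DRR000001", "ERP000546", "SRX123456"):
        print(acc, describe_accession(acc))
```

A check like this is handy before querying a repository, since a run (`..R`) and a study (`..P`) usually need different handling downstream.
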
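  18. Data hosting by large-scale projects ‣ Large projects sometimes publish their data themselves ‣ 1000 Genomes Project http://1000genomes.org ‣ The Cancer Genome Atlas Project http://tcga-data.nci.nih.gov ‣ ENCODE Project http://genome.ucsc.edu/encode ‣ Copies of the data are sometimes published on cloud services ‣ 1000 genomes on AWS http://aws.amazon.com/1000genomes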
  19. On the hierarchy of data and databases (diagram labels: Knowledge, Summarised Data, Experimental Data; Knowledge-base, Database, Primary Data Repository; Biological Information "Database")
  20. Databases related to NGS (diagram labels: Knowledge-base, Database, Primary Data Repository)

  21. Databases related to NGS (diagram labels: Knowledge-base, Database, Primary Data Repository)

  22. The role of public databases in an NGS-based research workflow: the role of databases in each step of the sequencing research procedure
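  23. A typical NGS-based research workflow: experimental design → pilot experiment → sampling → DNA preparation → library construction → sequencing → QC → filtering → alignment/assembly → QC → purpose-specific analysis → validation experiments → data release → paper submission → revision/additional experiments → acceptance → celebratory drinks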
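
The QC and filtering steps in this workflow can be illustrated with a small sketch. It is not taken from the slides: it assumes 4-line FASTQ records with Phred+33 quality encoding, and the thresholds and the function names (`read_fastq`, `mean_quality`, `filter_reads`) are illustrative choices only.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (header, sequence, quality string)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from 4-line FASTQ text (header, sequence, '+', quality)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it, "").rstrip("\n")
        next(it, "")  # the '+' separator line carries no data we need
        qual = next(it, "").rstrip("\n")
        yield header, seq, qual


def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 assumed by default)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_reads(records: Iterable[Record],
                 min_mean_q: float = 20.0,
                 min_length: int = 30) -> Iterator[Record]:
    """Keep reads that are long enough and whose mean quality passes the cutoff."""
    for header, seq, qual in records:
        if len(seq) >= min_length and mean_quality(qual) >= min_mean_q:
            yield header, seq, qual


if __name__ == "__main__":
    demo = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGT", "+", "####"]
    for rec in filter_reads(read_fastq(demo), min_length=4):
        print(rec[0])
```

In practice dedicated tools handle adapter trimming and per-base statistics as well; the point here is only the shape of the step, where reads go in and a filtered subset comes out before alignment or assembly.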
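  24. Steps in which public databases are involved: experimental design → pilot experiment → sampling → DNA preparation → library construction → sequencing → QC → filtering → alignment/assembly → QC → purpose-specific analysis → validation experiments → data release → paper submission → revision/additional experiments → acceptance → celebratory drinks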
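  25. Designing NGS research that makes use of public data ‣ Experimental design before sequencing: test the post-sequencing workflow by analyzing similar data ‣ After sequencing, during data analysis: assess the validity of the sequencing results; run comparative analyses against your own data ‣ After data analysis, when presenting results: release the data to a repository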
  26. From searching public data to analyzing it: Search, Download, and Data Analysis of Public Sequencing Data

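  27. From downloading public data to analyzing it ‣ Search with the repositories' own search functions: use the search at NCBI, EBI, or DDBJ; effective when the data ID is known in advance ‣ Search from related information such as publications or diseases: use DBCLS SRA ‣ Analyze with public analysis services: DDBJ Read Annotation Pipeline; Mudi: Mutation Discovery in yeast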
  28. How to use the repositories' search functions - github.com/inutano/sra_metadata_toolkit/wiki
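
As one concrete way to use a repository's search function, NCBI exposes its Entrez search through the E-utilities `esearch` endpoint. The sketch below only builds a query URL for the `sra` database and makes no network request. It is not taken from the slides or from the linked wiki, and the helper name `sra_search_url` and the example query are assumptions.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (not mentioned in the slides).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def sra_search_url(term: str, retmax: int = 20) -> str:
    """Build an esearch URL that queries the SRA database for `term`."""
    params = {
        "db": "sra",        # search the Sequence Read Archive
        "term": term,       # Entrez query, e.g. with an [Organism] field tag
        "retmax": retmax,   # maximum number of record IDs to return
        "retmode": "json",  # ask for a JSON response instead of XML
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(sra_search_url("Saccharomyces cerevisiae[Organism]"))
```

The returned IDs are Entrez UIDs rather than SRA accessions, so a follow-up request is needed to get run-level metadata. That extra step is part of why ID-based lookup works best when the accession is already known.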

  29. Using DBCLS SRA - http://sra.dbcls.jp

  30. Using DBCLS SRA - http://sra.dbcls.jp

  31. Searching by publication - http://sra.dbcls.jp/cgi-bin/publication.cgi

  32. Full-text keyword search - http://sra.dbcls.jp/search

  33. Full-text keyword search - http://sra.dbcls.jp/search

  34. Full-text keyword search - http://sra.dbcls.jp/search

  35. Full-text keyword search - http://sra.dbcls.jp/search

  36. Full-text keyword search - http://sra.dbcls.jp/search

  37. A public NGS analysis pipeline: DDBJ Read Annotation Pipeline - http://p.ddbj.nig.ac.jp

  38. A public NGS analysis pipeline: DDBJ Read Annotation Pipeline - http://p.ddbj.nig.ac.jp

  39. Learn how to use it at the DDBJ training courses (materials and recordings are also available) http://www.ddbj.nig.ac.jp/ddbjing/

  40. Mudi: Mutation discovery in yeast - http://naoii.nig.ac.jp/mudi_top.html

  41. Mudi: Mutation discovery in yeast - http://naoii.nig.ac.jp/mudi_top.html

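  42. Summary ‣ DBCLS and the Database Integration Project maintain and integrate life science resources in Japan ‣ Making good use of data released in public databases can make the research workflow more efficient ‣ Using public services for search and analysis can lower the cost of data analysis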

  43. Thank you for your attention! t.ohta@dbcls.rois.ac.jp http://speakerdeck.com/inutano