Massive parallel processing of public high-throughput sequencing data and experiment of sharing data analysis environment


NIG/DDBJ supercomputer user meeting at National Institute of Genetics


Tazro Inutano Ohta

July 22, 2014

Transcript

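  1. Parallel processing of large-scale NGS data and the future of environment setup on shared supercomputers. Tazro Ohta <t.ohta@dbcls.rois.ac.jp>, Database Center for Life Science, Research Organization of Information and Systems. Prepared for the NIG DDBJ supercomputer user meeting,

    July 22, 2014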
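  2. Summary
    ‣ Using the NIG supercomputer, we batch-process all public NGS data and are building a database from the results.
    ‣ We are investigating and developing environment setup based on VMs and containers, so that data analysis pipelines can be shared and re-run.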

  3. sra.dbcls.jp

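  4. Read quality database of public NGS data
    ‣ Run FastQC on public NGS data, then collect and aggregate the results
    ‣ Covers all data that can be downloaded
    ‣ Completed for data registered through [year missing]
    ‣ Total number of data sets
        ‣ [figure missing] sequence runs (single or paired)
    ‣ Total data size
        ‣ [figure missing] T base pairs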
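The "collect and aggregate" step on slide 4 can be sketched as follows. This is a minimal illustration, not the pipeline used for the database: it assumes each FastQC run leaves a `summary.txt` with tab-separated `STATUS`, module name, and file name on each line, and the sample contents and file names (`sample_a.fastq`, `sample_b.fastq`) are hypothetical.

```python
from collections import Counter, defaultdict


def parse_fastqc_summary(text):
    """Yield (status, module, filename) tuples from the text of a FastQC summary.txt."""
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        status, module, filename = line.split("\t", 2)
        yield status, module, filename


def aggregate(summaries):
    """Count PASS/WARN/FAIL per FastQC module across many runs."""
    counts = defaultdict(Counter)
    for text in summaries:
        for status, module, _filename in parse_fastqc_summary(text):
            counts[module][status] += 1
    return counts


# Hypothetical summary.txt contents for two runs (not real results).
run_a = (
    "PASS\tBasic Statistics\tsample_a.fastq\n"
    "FAIL\tPer base sequence quality\tsample_a.fastq\n"
)
run_b = (
    "PASS\tBasic Statistics\tsample_b.fastq\n"
    "WARN\tPer base sequence quality\tsample_b.fastq\n"
)

counts = aggregate([run_a, run_b])
for module, statuses in sorted(counts.items()):
    print(module, dict(statuses))
```

In a batch setting, the list passed to `aggregate` would be built by reading the `summary.txt` of every finished job, so the aggregation itself stays independent of how the jobs were scheduled.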
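  5. Differences from the existing computing environment
    ‣ Data transfer speed
        ‣ Transfer of [size missing] GB of data with lftp mget ([factor missing]x)
    ‣ Number of concurrent parallel jobs
        ‣ [count missing] CPU, [count missing] CPU ([factor missing]x)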
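  6. Challenge: pipelines described in papers are difficult to reproduce
    ‣ Software version management is a problem
    ‣ Installation can be difficult in a shared environment
    ‣ For now, we work around this with tools such as LPM, by courtesy of Kasahara-san of the University of Tokyo
        ‣ http://www.kasahara.ws/lpm
    ‣ Handle a large amount of data one item at a time, by hand?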
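  7. A solution for ensuring the reproducibility of data analysis
    ‣ Share analysis pipelines together with their whole environment as Virtual Machines (VMs) or containers
    ‣ Deploy an image and start the analysis right away
    ‣ We are investigating and developing techniques for environment setup and image sharing
        ‣ Sharing AMIs on Amazon Web Service
        ‣ Sharing container images on Docker Hub
    ‣ We would like the NIG supercomputer to be compatible with these as well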
  8. Publish and share analysis environments in the same way as code and software

  9. Publish and share analysis environments in the same way as code and software
    $ docker run -d -p 8080:80 -t inutano/galaxy
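To make the command on slide 9 easier to read, the sketch below rebuilds it flag by flag from Python. The helper name `docker_run_args` is hypothetical. The flag meanings are those of the standard `docker run` CLI. That the image serves its web interface on container port 80 is inferred from the slide's `8080:80` mapping, not verified here.

```python
import shlex


def docker_run_args(image, host_port, container_port):
    """Build the argument list for a detached container with one published port."""
    return [
        "docker", "run",
        "-d",                                   # detached: run in the background
        "-p", f"{host_port}:{container_port}",  # publish container port on the host
        "-t",                                   # allocate a pseudo-TTY
        image,                                  # image name, pulled from the registry if absent
    ]


args = docker_run_args("inutano/galaxy", 8080, 80)
print(shlex.join(args))  # prints: docker run -d -p 8080:80 -t inutano/galaxy
```

On a machine with Docker installed, passing `args` to `subprocess.run(args, check=True)` would start the container, after which the service would be expected at port 8080 of the host.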

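  10. Choosing a computing platform
    ‣ Once image sharing removes the dependence on a particular environment, there are more options
        ‣ Machines you purchase yourself
        ‣ Shared computing resources such as the NIG supercomputer
        ‣ Infrastructure as a Service (IaaS) such as Amazon Web Service (AWS)
    ‣ The deciding factors are setup cost, machine configuration, and cost
    ‣ AWS costs have fallen considerably, which makes it a realistic option
    ‣ Routine computation goes on the NIG supercomputer (because it is free)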
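  11. Comparison of the merits of each computing environment from the user's point of view

    Environment | Initial setup cost | Maintenance cost | Configuration flexibility | Reliability/persistence | Confidentiality | Notes
    Individual purchase | ✕ | ✕ | ◯ | △ | ◯ | No constraints if funding is available
    Shared computing resources (NIG supercomputer) | ◯ | ◯ | △ | △ | ✕ | Directly connected to the DDBJ databases
    IaaS (cloud) | ◯ | △ | ◯ | △ | △ | As much as needed, when needed; costs also fall year by year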
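  12. Summary
    ‣ By batch-processing all public NGS data on the NIG supercomputer, we are building a database.
    ‣ We are investigating and developing environment setup based on VMs and containers, together with a public database, for preserving, persisting, and re-running data processing and analysis pipelines.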