The Current State of Cloud-Based Genome Data Analysis (クラウドを活用したゲノム情報解析の現状)
Tazro Inutano Ohta
July 22, 2016
IPSJ (Information Processing Society of Japan) Seminar Series 2016, Session 2: Cloud
http://www.ipsj.or.jp/event/seminar/2016/program02.html
Transcript
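The Current State of Cloud-Based Genome Data Analysis
22 July 2016 | IPSJ (Information Processing Society of Japan) Seminar Series 2016, Session 2: Cloud
Tazro Ohta, Project Researcher, Database Center for Life Science (DBCLS), Joint Support-Center for Data Science Research, Research Organization of Information and Systems (ROIS), Inter-University Research Institute Corporation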
Licensed under CC-BY 4.0 ©2016 Tazro Ohta (DBCLS)
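
Agenda
1. What is happening in the life sciences today?
2. What kinds of computers are needed today?
3. Solving the problems with the cloud

1. What is happening in the life sciences today?
- Advances in laboratory instruments have increased both the size and the volume of data
  - In genomics, the "next-generation DNA sequencer" has appeared
- The accumulation of data has made computational biology a thriving field
  - Protein 3D structure data, image data
- Making data processing and analysis more efficient is still an urgent task
  - There is no time to wait for algorithms to improve
  - In some cases the problem is solved with hardware performance

Example of protein 3D structure analysis
MEGADOCK: a project by Assistant Professor Ohue and colleagues, Akiyama Laboratory, Tokyo Institute of Technology
http://www.nii.ac.jp/csi/openforum2016/track/pdf/20160526AM_TOUKOUDAI_akiyama2.pdf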
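
What is happening in genome science?
- To put the advance of laboratory instruments in a crude analogy...
- Sea = genome, fish = genes
- "Characterize the sea by finding out what kinds of fish live in it"
- Technological progress has improved the performance of the tools
- Rod-and-line fishing has turned into bottom trawling

As laboratory instruments advance, interpreting the results becomes costly
Photo (left): Tsuriba Camera @kazzwatabe https://tsuriba.camera/posts/XQeP3qmIp6A
Photo (right): photo by atramos https://www.flickr.com/photos/atramos/5508960637

The output of earlier DNA sequencers could be checked by eye
The output of today's DNA sequencers cannot be handled at all without a computer

Performance comparison of DNA sequencer models (an arrow on the chart marks the "bottom trawling" instruments)
https://flxlexblog.wordpress.com/2014/06/11/developments-in-next-generation-sequencing-june-2014-edition/

Growth of data size in the public data repository
http://www.ncbi.nlm.nih.gov/Traces/sra/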
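
Data obtained from DNA sequencers
- "Decoding a genome" sounds like a single step, but...
- Extract DNA from a biological sample
- Fragment the extracted DNA into short molecules
- Analyze the fragments with a DNA sequencer
- The output is a list of short, fragmented base sequences
- Reconstruct the original DNA from the information in a large number of DNA fragments
  - de novo assembly
  - Reference alignment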
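
Reconstructing the original DNA from fragments can be illustrated with a toy example. The following is a minimal sketch of greedy overlap merging, assuming error-free reads from a single strand; it is not how any particular assembler works. The names `overlap` and `greedy_assemble` and the sample reads are made up for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the longest overlap.

    Toy assumption: reads are error-free and come from one strand.
    Returns the list of contigs left when no overlaps remain.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three overlapping fragments of a made-up 14-base sequence
fragments = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
print(greedy_assemble(fragments))
```

Real sequencing data adds sequencing errors, repeats, and reads from both strands, which is one reason the choice of tool depends on the data.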

The data output by a DNA sequencer is fragmented
Source: https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014

Comparing a DNA sequencer to a shredder...
Source: https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014

Genome assembly = restoring the book
Source: https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014

Reference alignment = restoring by lining the pieces up along a model (the reference)
Source: http://www.historyofnimr.org.uk/mill-hill-essays/essays-yearly-volumes/2010-2/bringing-it-all-back-home-next-generation-sequencing-technology-and-you/
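
Lining fragments up along a reference can also be shown as a toy sketch. It assumes error-free reads and exact matching with Python's built-in `str.find`; real aligners use indexed, error-tolerant search. The names `align_reads` and `coverage` and the sequences are hypothetical.

```python
def align_reads(reference: str, reads):
    """Map each read to the first exact-match position in the reference.

    A position of -1 means the read does not occur in the reference.
    """
    return {read: reference.find(read) for read in reads}


def coverage(reference: str, reads):
    """Count how many aligned reads cover each base of the reference."""
    depth = [0] * len(reference)
    for read in reads:
        pos = reference.find(read)
        if pos >= 0:
            for k in range(pos, pos + len(read)):
                depth[k] += 1
    return depth


reference = "ATGCGTACGTTAGC"           # made-up reference sequence
reads = ["GTACG", "TTAGC", "CCCC"]     # the last read does not match
print(align_reads(reference, reads))
print(coverage(reference, reads))
```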
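
Data analysis software (analysis tools)
- Many analysis tools are released as open source
- The best tool depends on the nature of the target data
- Data analysts (biologists) perform the data analysis
  - Tool developers (implementers) and users are not the same people
  - Users do not necessarily understand a tool's behavior completely
- Analysts are not always doing data analysis
  - Many researchers do analysis on the side of their wet-lab experiments

Analysis with openly released implementations
https://omictools.com/de-novo-genome-sequencing-category
https://omictools.com/whole-genome-resequencing-category

What is happening in the life sciences today? Summary
- The volume and […] of data are increasing rapidly and will keep increasing
- Different tools and algorithms are used depending on the purpose
- Data analysts and tool developers (implementers) are often different people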
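
2. What kinds of computers are needed today?

What kinds of computers are being used today?
- PC
- PC clusters
- Supercomputers at hub centers
- National Institute of Genetics supercomputer system

Even the textbook says to buy a Mac
From "次世代シークエンサーDRY解析教本" (Next-Generation Sequencer DRY Analysis Textbook), a supplement to 細胞工学 (Cell Technology)

What kinds of computers are needed?
- An ordinary PC struggles as the target data gets larger or more numerous
- Analysis data keeps piling up
  - Large storage with fast reads and writes
- Tools crash with "Out of memory"
  - Large-scale memory
- Batch processing is run against a large number of samples
  - A job scheduling system for distributed execution
- Demand for large shared computers is growing
  - The National Institute of Genetics introduced its supercomputer (2012~) => still not enough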
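
Running the same batch step over many samples can be sketched on a single machine with Python's standard `concurrent.futures`. This is only a stand-in for a distributed job scheduler, which the slides do not name; `gc_content` is a hypothetical placeholder for a real per-sample analysis, and the sample data is made up.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(seq: str) -> float:
    """Placeholder per-sample analysis: fraction of G and C bases."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def run_batch(samples: dict, max_workers: int = 4) -> dict:
    """Apply the same analysis to every sample using a worker pool.

    `samples` maps a sample name to its sequence. `pool.map` returns
    results in input order, so they can be zipped back to the names.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(gc_content, samples.values())
        return dict(zip(samples.keys(), results))


samples = {"s1": "GGCC", "s2": "ATAT", "s3": "ATGC"}  # made-up samples
print(run_batch(samples))
```

On a shared cluster the per-sample step would typically be submitted as separate jobs, so the pool size here plays the role of the number of nodes a user is allowed to occupy.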
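
Inter-University Research Institute Corporation, Research Organization of Information and Systems, National Institute of Genetics
SuperComputer Facilities of National Institute of Genetics
photo from http://sc.ddbj.nig.ac.jp/index.php/ja-gallery

The number of users keeps growing
From presentation material by Ogasawara of the DDBJ Center, National Institute of Genetics

Disk space is running short
https://sc.ddbj.nig.ac.jp/index.php/ja-nig-statistics

What is the bottleneck in practice?
From hearings at supercomputer user meetings and elsewhere
- Concerns of users who are unfamiliar with computers
  - It is unclear what each computer can and cannot do
  - They need a large-scale computer but cannot use a CUI
- Concerns of users who have mastered computers
  - The machines are crowded and jobs cannot be submitted
  - They cannot put enough budget into data analysis and storage
  - Setting up the environment is costly
  - They do not want to look after the machines

"Wet-lab experiments cost money, but data analysis does not cost that much" is a common belief
http://trattoriainutano.tumblr.com/post/132214903857/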

What kinds of computers are needed? Summary
- Requirements differ depending on the target data and the purpose
- In genomics, storage and memory are serious concerns
- Users' computer literacy varies widely
- The layer users want depends on their level

3. Solving the problems with the cloud
- Problems the cloud can solve
  - Up-front installation cost
  - Node congestion
  - Maintenance cost
- Challenges in using the cloud
  - Storage cost
  - Paying from research grants
  - Handling unpublished data and data containing personal information

Cloud use case (SaaS): Google Genomics
https://cloud.google.com/genomics/v1/analyze-variants

Cloud use case (IaaS): 1000 Genomes data on AWS
https://aws.amazon.com/jp/1000genomes/

The NIH Commons: in the US, the funding side is promoting cloud use
"The Commons is a shared virtual space where scientists can work with the digital objects of biomedical research, i.e. it is a system that will allow investigators to find, manage, share, use and reuse data, software, metadata and workflows." - https://datascience.nih.gov/commons

Cloud use case (PaaS/SaaS): genome analysis pipeline on the Academic Intercloud
- JST CREST: research on application-centric overlay cloud technology using the intercloud (PI: Prof. Aida, NII)
- The Academic Intercloud effort
  - Link the NIG supercomputer with the NII cloud and academic clouds in other countries
  - Make applications portable by Dockerizing each tool used in the analysis
  - Build workflows from pre-combined tools and provide a GUI
  - Launch machines with the resources best suited to each analysis dataset
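
What "Dockerizing" a tool buys in practice can be sketched by building the command line that runs it. This is an illustrative sketch only: the image name `example/aligner:1.0`, the tool arguments, and the paths are hypothetical, while `docker run`, `--rm`, `-v`, and `-w` are standard Docker CLI options.

```python
import shlex


def docker_command(image, tool_args, host_dir, container_dir="/data"):
    """Build a `docker run` argument list for one containerized tool.

    The host data directory is mounted into the container and used as
    the working directory, so the tool sees the same paths on any host.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:{container_dir}",  # mount the data directory
        "-w", container_dir,                  # run inside the mounted directory
        image,
        *tool_args,
    ]


# Hypothetical image and arguments, for illustration only
cmd = docker_command(
    "example/aligner:1.0",
    ["align", "--ref", "ref.fa", "reads.fq"],
    "/home/user/project",
)
print(shlex.join(cmd))
```

Because the tool and its dependencies travel inside the image, the same command can be issued on a laptop, a supercomputer node, or a cloud instance, provided Docker is available there.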
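
Solving the problems with the cloud
Summary: challenges in using the cloud
- Storage cost
  - Fast I/O is required during computation
  - Low-cost storage is needed for archiving
- (For commercial clouds) paying from research grants
- Handling data that contains personal information
  - Establishing safety: building up a track record of use
  - Drawing up guidelines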
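
The trade-off between fast storage for computation and cheap storage for archiving can be made concrete with simple arithmetic. The sketch below uses made-up prices and data sizes purely for illustration; they are not quotes from any provider, and `monthly_storage_cost` is a hypothetical helper.

```python
def monthly_storage_cost(hot_tb, cold_tb, hot_price_per_tb, cold_price_per_tb):
    """Monthly cost when data is split between a fast tier and an archive tier."""
    return hot_tb * hot_price_per_tb + cold_tb * cold_price_per_tb


# Hypothetical numbers: 100 TB in total, 10 TB under active analysis
HOT_PRICE = 20   # assumed price per TB per month for fast storage
COLD_PRICE = 4   # assumed price per TB per month for archive storage

all_hot = monthly_storage_cost(100, 0, HOT_PRICE, COLD_PRICE)
tiered = monthly_storage_cost(10, 90, HOT_PRICE, COLD_PRICE)
print(all_hot, tiered)
```

Under these assumed prices, keeping only the actively analyzed data on the fast tier costs 560 per month instead of 2000; the real ratio depends on the provider and on how often archived data has to be read back.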

Genomic information at university hospitals, one after another
Agencies such as AMED are promoting genomic […]

Secure cloud computing for genomic data
Datta, Somalee, Keith Bettinger, and Michael Snyder. "Secure cloud computing for genomic data." Nature Biotechnology 34.6 (2016): 588-591.
The security needed to use the cloud for genomic data analysis must be provided through cooperation between research institutions and cloud providers

Security requirements (from the same paper)
- The data privacy agreement: agreement with the research institution on how the data is handled
- Physical and logical security
- Encryption of data, both at rest and in transit
- Authentication of users
- Principle of Least Privilege
- Firewalls
- Logging and monitoring
- Training on security certification
- Security and privacy: protection of personal information

The relationship between handling personal information and research use
From the Nikkei (日本経済新聞) article "医学研究と個人情報の両立を" (Reconcile medical research with personal information)
http://www.nikkei.com/article/DGXKZO05121060S6A720C1EA1000/
- Research data containing personal information is very important for working out the causes of disease and for treatment
- A secure environment would be a major asset for advancing research

Summary
- What is happening in the life sciences today?
  - Demand for computing is rising as large-scale data accumulates
- What kinds of computers are needed today?
  - Genomics puts the emphasis on storage and memory
  - Requirements differ in fine detail from user to user
- We want to solve the problems by using the cloud
  - Make the cloud even more convenient
  - Increasing the number of use cases is important

Reference material

About the Database Center for Life Science (DBCLS)
- Develops technology that contributes to database integration in the life sciences
- Core technology development
  - Works on technology for federated data integration using Semantic Web technologies and natural language processing, and on drawing up international standards
- Collaboration with DDBJ
  - Develops technology for making use of data, starting with large-scale genomic data
- Runs database projects jointly with NBDC, a center of JST
- Collaborates with DDBJ within the same organization (ROIS, which NII also belongs to)
http://dbcls.rois.ac.jp/about