Current State of Genome Information Analysis Using the Cloud (クラウドを活用したゲノム情報解析の現状)
Tazro Inutano Ohta
July 22, 2016
IPSJ (Information Processing Society of Japan) Seminar Series 2016, Session 2: Cloud
http://www.ipsj.or.jp/event/seminar/2016/program02.html
Transcript
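Current State of Genome Information Analysis Using the Cloud
22 July 2016 | IPSJ Seminar Series 2016, Session 2: Cloud
Tazro Ohta, Project Researcher, Database Center for Life Science (DBCLS), Joint Support-Center for Data Science Research, Research Organization of Information and Systems (ROIS)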
Licensed under CC-BY 4.0 ©2016 Tazro Ohta (DBCLS)
Agenda
1. What is happening in the life sciences today?
2. What kind of computers are needed today?
3. Solving the problems with the cloud
1. What is happening in the life sciences today?
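What is happening in the life sciences today?
- Advances in laboratory instruments are increasing both the size and the volume of data
  - In genomics, "next-generation DNA sequencers" have appeared
- The accumulation of data has made computational biology a thriving field
  - Protein 3D structure data, image data
- Making data processing and analysis more efficient remains an urgent task
  - There is no time to wait for algorithms to improve
  - In some cases the problem is solved with raw hardware performance
Example of protein 3D structure analysis: MEGADOCK, a project by Assistant Professor Ohue and colleagues at the Akiyama Laboratory, Tokyo Institute of Technology http://www.nii.ac.jp/csi/openforum2016/track/pdf/20160526AM_TOUKOUDAI_akiyama2.pdf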
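What is happening in genome science?
- If we compare the progress of laboratory instruments to fishing…
  - Sea = genome, fish = genes
  - "Characterize the sea by finding out what kinds of fish live in it"
- Technical progress has improved the performance of the tools
  - Line fishing has turned into bottom trawling
Photo, left: Tsuriba Camera @kazzwatabe https://tsuriba.camera/posts/XQeP3qmIp6A Photo, right: photo by atramos https://www.flickr.com/photos/atramos/5508960637
As laboratory instruments advance, interpreting the results becomes more costly
Output from earlier DNA sequencers could be checked by eye; output from today's DNA sequencers cannot be handled at all without a computer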
https://flxlexblog.wordpress.com/2014/06/11/developments-in-next-generation-sequencing-june-2014-edition/ Performance comparison of DNA sequencer models (an arrow on the chart is labeled "bottom trawl")
http://www.ncbi.nlm.nih.gov/Traces/sra/ Growth in the data size of a public data repository
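Data obtained from a DNA sequencer
- "Decoding a genome" is easy to say, but…
  - Extract DNA from a biological sample
  - Fragment the extracted DNA into short molecules
  - Analyze the fragments with a DNA sequencer
  - The output is a list of short, fragmented base sequences
- Reconstruct the original DNA from the information in a large number of DNA fragments
  - de novo assembly
  - Reference alignment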
https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014 The data output by a DNA sequencer is fragmented
https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014 If we compare a DNA sequencer to a shredder…
https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014 Genome assembly = reconstructing the shredded book
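The "reconstructing the book" idea can be sketched in a few lines of Python. This is a toy illustration only, not the method of any assembler mentioned in the talk: it assumes error-free reads, uses a simple greedy "merge the best-overlapping pair" strategy, and the names `overlap` and `greedy_assemble` as well as the example reads are made up for this sketch.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 when no overlap of at least min_len characters exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, -1, -1)  # (overlap length, index of left, index of right)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


# Three overlapping "shredded" pieces of the made-up sequence ATGCGTACGTTAGC
print(greedy_assemble(["ATGCGTA", "CGTACGTT", "CGTTAGC"]))
# -> ['ATGCGTACGTTAGC']
```

Real reads contain sequencing errors and repeats, and there are far more of them, which is one reason the memory and compute demands discussed later in the talk arise.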
http://www.historyofnimr.org.uk/mill-hill-essays/essays-yearly-volumes/2010-2/bringing-it-all-back-home-next-generation-sequencing-technology-and-you/ Reference alignment = reconstruct by lining the fragments up along a model (the reference)
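Lining fragments up along a reference can be sketched in the same toy style. The example below assumes exact matches only (real aligners tolerate mismatches and use indexes rather than a linear scan); `align_reads`, `coverage`, and the sequences are hypothetical names and data chosen for illustration.

```python
def align_reads(reference: str, reads: list[str]) -> dict[str, list[int]]:
    """Map each read to every 0-based position where it matches the reference exactly."""
    hits: dict[str, list[int]] = {}
    for read in reads:
        positions = []
        start = reference.find(read)
        while start != -1:
            positions.append(start)
            start = reference.find(read, start + 1)
        hits[read] = positions
    return hits


def coverage(reference: str, hits: dict[str, list[int]]) -> list[int]:
    """Count how many aligned reads cover each position of the reference."""
    depth = [0] * len(reference)
    for read, positions in hits.items():
        for pos in positions:
            for k in range(pos, pos + len(read)):
                depth[k] += 1
    return depth


reference = "ATGCGTACGTTAGC"
reads = ["GCGTA", "ACGTT", "TTAGC", "GGGGG"]  # the last read does not match anywhere

hits = align_reads(reference, reads)
print(hits)
# -> {'GCGTA': [2], 'ACGTT': [6], 'TTAGC': [9], 'GGGGG': []}
print(coverage(reference, hits))
# -> [0, 0, 1, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1]
```

Compared with de novo assembly, the reference does the work of ordering the fragments, so each read can be placed independently of the others.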
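Data analysis software (analysis tools)
- Many analysis tools are released as open source
  - The best tool differs depending on the nature of the target data
- Data analysts (biologists) carry out the data analysis
  - Tool developers (implementers) and users are not the same people
  - Users do not necessarily understand a tool's behavior completely
- Analysts are not always doing data analysis
  - Many researchers analyze data on the side of their wet-lab experiments
Analysis using implementations released as open source https://omictools.com/de-novo-genome-sequencing-category
Analysis using implementations released as open source https://omictools.com/whole-genome-resequencing-category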
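What is happening in the life sciences today? Summary
- The amount of data is growing rapidly and will keep growing
- Different tools and algorithms are used depending on the purpose
- Data analysts and tool developers (implementers) are often different people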
2. What kind of computers are needed today?
What kind of computers are used today?
- PC
- PC cluster
- Supercomputers at research hubs
  - National Institute of Genetics supercomputer system
From the textbook 次世代シークエンサーDRY解析教本 (細胞工学別冊): the textbook tells readers to buy a Mac
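What kind of computers are needed?
- Ordinary PCs struggle once the target data becomes larger or more numerous
- Analysis data keeps piling up
  - Huge storage with fast reads and writes
- Tools crash with "Out of memory"
  - Large-scale memory
- Batch jobs must be run against a large number of samples
  - A distributed job scheduling system
- Demand for large shared computers is rising
  - National Institute of Genetics supercomputer introduced (2012~) => still not enough
SuperComputer Facilities of National Institute of Genetics (Research Organization of Information and Systems)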
photo from http://sc.ddbj.nig.ac.jp/index.php/ja-gallery
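Ever-increasing number of users (from presentation materials by Ogasawara of the DDBJ Center, National Institute of Genetics)
Disk space under pressure https://sc.ddbj.nig.ac.jp/index.php/ja-nig-statistics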
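Where are the bottlenecks in practice? (from hearings at supercomputer user meetings and elsewhere)
- Concerns of users who are unfamiliar with computers
  - They cannot tell what each system can and cannot do
  - They need large-scale computers but cannot use a CUI
- Concerns of users who are fluent with computers
  - The machines are congested and jobs cannot be submitted
  - They cannot commit enough budget to data analysis and storage
  - Building the environment is costly
  - They do not want to look after the machines themselves
It is widely assumed that "wet-lab experiments cost money, but data analysis does not cost that much" http://trattoriainutano.tumblr.com/post/132214903857/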
What kind of computers are needed? Summary
- Requirements vary with the target data and the purpose
- In genomics, storage and memory are the serious concerns
- Users' computer literacy varies widely
- The layer a user wants depends on the user's skill level
3. Solving the problems with the cloud
Solving the problems with the cloud
- Problems the cloud can solve
  - Upfront deployment cost
  - Congested nodes
  - Maintenance cost
- Challenges in using the cloud
  - Storage cost
  - Paying from research funds
  - Handling unpublished data and data containing personal information
Cloud use case (SaaS): Google Genomics https://cloud.google.com/genomics/v1/analyze-variants
Cloud use case (IaaS): 1000 Genomes data on AWS https://aws.amazon.com/jp/1000genomes/
The NIH Commons: in the United States, the funding side is promoting cloud use
"The Commons is a shared virtual space where scientists can work with the digital objects of biomedical research, i.e. it is a system that will allow investigators to find, manage, share, use and reuse data, software, metadata and workflows." - https://datascience.nih.gov/commons
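Cloud use case (PaaS/SaaS): genome analysis pipeline on the Academic Inter-Cloud
- JST CREST: research on application-centric overlay cloud technology that makes use of the inter-cloud (principal investigator: Prof. Aida, NII)
- The Academic Inter-Cloud effort
  - Link the National Institute of Genetics supercomputer with the NII cloud and with academic clouds in other countries
  - Make applications portable by packaging each analysis tool with Docker
  - Build workflows that combine tools in advance, and provide a GUI
  - Launch machines with the resources best suited to each analysis dataset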
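Solving the problems with the cloud. Summary: challenges in using the cloud
- Storage cost
  - Fast I/O is required during computation
  - Low-cost storage is wanted for archiving
- (For commercial clouds) paying from research funds
- Handling data containing personal information
  - Establishing safety: building up a track record of use
  - Drawing up guidelines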
University hospitals, too, are producing genomic information one after another: genomics initiatives are being promoted by agencies such as AMED
Secure cloud computing for genomic data
Datta, Somalee, Keith Bettinger, and Michael Snyder. "Secure cloud computing for genomic data." Nature Biotechnology 34.6 (2016): 588-591.
The security needed to use the cloud for genome data analysis has to be established jointly by research institutions and cloud providers
Security requirements (Datta, Bettinger, and Snyder, 2016)
- The data privacy agreement: agreement with the research institution on how data is handled
- Physical and logical security
- Encryption of data, both at rest and in transit
- Authentication of users
- Principle of Least Privilege
- Firewalls
- Logging and monitoring
- Training on security and authentication
- Security and privacy: protection of personal information
The relationship between handling personal information and research use (from the Nikkei article 「医学研究と個人情報の両立を」 http://www.nikkei.com/article/DGXKZO05121060S6A720C1EA1000/)
- Research data containing personal information is extremely important for uncovering the causes of disease and for treatment
- A secure environment would be a powerful asset for advancing research
Summary
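- What is happening in the life sciences today?
  - The accumulation of large-scale data is driving up demand for computing
- What kind of computers are needed today?
  - In genomics, storage and memory are what matter most
  - Requirements differ in fine detail from user to user
- We want to keep solving these problems with the cloud
  - Keep improving the convenience of the cloud
  - Increasing the number of use cases is important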
Reference material
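About the Database Center for Life Science (DBCLS)
- Responsible for technology development that contributes to database integration in the life sciences
- Core technology development
  - Develops technology for federated data integration using Semantic Web technology and natural language processing, and works on establishing international standards
- Collaboration with DDBJ
  - Develops technology for making use of data, starting with large-scale genome data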
- Runs database projects jointly with NBDC, a center of JST
- Collaborates with DDBJ, which belongs to the same organization (ROIS; NII belongs to it too) http://dbcls.rois.ac.jp/about