The Current State of Genome Informatics Analysis Using the Cloud (クラウドを活用したゲノム情報解析の現状)
Tazro Inutano Ohta
July 22, 2016
Information Processing Society of Japan (IPSJ) Seminar Series 2016, Session 2: Cloud
http://www.ipsj.or.jp/event/seminar/2016/program02.html
Transcript
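The Current State of Genome Informatics Analysis Using the Cloud
22 July 2016 | Information Processing Society of Japan (IPSJ) Seminar Series 2016, Session 2: Cloud
Tazro Ohta, Project Researcher, Database Center for Life Science (DBCLS), Joint Support-Center for Data Science Research, Research Organization of Information and Systems (ROIS)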
Licensed under CC-BY 4.0 ©2016 Tazro Ohta (DBCLS)
Agenda
1. What is happening in the life sciences today?
2. What kind of computers are needed today?
3. Solving the problems with the cloud
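1. What is happening in the life sciences today?
- Advances in laboratory instruments have increased both the size and the volume of data
  - In genomics, "next-generation DNA sequencers" have arrived
- The accumulation of data has made computational biology a thriving field
  - Protein 3D structure data, image data
- Making data processing and analysis more efficient remains an urgent task
  - There is no time to wait for algorithms to improve
  - In some cases the problem is solved by raw hardware performance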
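Example of protein 3D structure analysis: MEGADOCK, a project by Assistant Professor Ohue and colleagues at the Akiyama Laboratory, Tokyo Institute of Technology http://www.nii.ac.jp/csi/openforum2016/track/pdf/20160526AM_TOUKOUDAI_akiyama2.pdf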
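What is happening in genome science?
- A rough analogy for the advances in laboratory instruments:
  - Sea = genome, fish = genes
  - "Characterize the sea by finding out what fish live in it"
- Technical progress has improved the tools
  - Fishing with a rod has turned into bottom trawling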
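Photo (left): Tsuriba Camera @kazzwatabe https://tsuriba.camera/posts/XQeP3qmIp6A Photo (right): photo by atramos https://www.flickr.com/photos/atramos/5508960637 As laboratory instruments advance, interpreting the results becomes costly.
The output of earlier DNA sequencers could be checked by eye. With the output of today's DNA sequencers, nothing can be done without a computer.
https://flxlexblog.wordpress.com/2014/06/11/developments-in-next-generation-sequencing-june-2014-edition/ ← ← bottom trawl. Performance comparison of DNA sequencer models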
http://www.ncbi.nlm.nih.gov/Traces/sra/ Growth in the data size of a public data repository
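Data obtained from a DNA sequencer
- "Reading a genome" is easy to say, but it involves several steps:
  - Extract DNA from a biological sample
  - Fragment the extracted DNA into short molecules
  - Analyze the fragments with a DNA sequencer
  - The output is a list of short, fragmented base sequences
- The original DNA is reconstructed from the information in a large number of DNA fragments
  - de novo assembly
  - Reference alignment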
https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014 The data output by a DNA sequencer is fragmented
https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014 If we compare a DNA sequencer to a shredder...
https://speakerdeck.com/michaelbarton/ranking-genome-assemblers-with-docker-containers-dockercon-eu-2014 Genome assembly = reconstructing the book
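The "reconstructing the book from shredded strips" idea can be illustrated with a toy greedy overlap assembler. This is only a sketch under simplifying assumptions (error-free reads, a single strand, no repeats); the function names `overlap` and `greedy_assemble` and the example reads are hypothetical and do not come from the slides or from any real assembler.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best = (0, -1, -1)  # (overlap length, index of left read, index of right read)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # nothing overlaps any more: keep the contigs separate
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Three hypothetical overlapping fragments of one short sequence.
    print(greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"]))  # ['ATGGCGTACGTTA']
```

The all-pairs comparison is quadratic in the number of reads, which hints at why assembling real sequencer output demands the large memory and compute discussed later in the talk.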
http://www.historyofnimr.org.uk/mill-hill-essays/essays-yearly-volumes/2010-2/bringing-it-all-back-home-next-generation-sequencing-technology-and-you/ Reference alignment = reconstruction by lining the fragments up along a template (the reference)
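Reference alignment can be sketched in the same toy spirit: each read is placed at the ungapped position of the reference where it mismatches least, and a consensus is then called by majority vote at each position. The names `best_position` and `consensus`, the mismatch threshold, and the example sequences are assumptions made for illustration. Production aligners rely on indexed, gap-aware algorithms rather than this brute-force scan.

```python
from collections import Counter


def best_position(reference: str, read: str) -> tuple[int, int]:
    """Return (offset, mismatches) of the best ungapped placement of `read`."""
    best = (0, len(read) + 1)
    for off in range(len(reference) - len(read) + 1):
        window = reference[off:off + len(read)]
        mismatches = sum(r != q for r, q in zip(window, read))
        if mismatches < best[1]:
            best = (off, mismatches)
    return best


def consensus(reference: str, reads: list[str], max_mismatches: int = 1) -> str:
    """Pile reads up along the reference and take the majority base per position."""
    piles = [Counter() for _ in reference]
    for read in reads:
        off, mismatches = best_position(reference, read)
        if mismatches <= max_mismatches:  # discard reads that fit nowhere
            for k, base in enumerate(read):
                piles[off + k][base] += 1
    # Positions covered by no read are reported as "N".
    return "".join(p.most_common(1)[0][0] if p else "N" for p in piles)


if __name__ == "__main__":
    ref = "ATGGCGTACGTTAGC"
    # Hypothetical reads from a sample that carries A instead of G at index 5.
    reads = ["ATGGCAT", "GCATACG", "ACGTTAGC"]
    print(consensus(ref, reads))  # ATGGCATACGTTAGC
```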
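Data analysis software (analysis tools)
- Many analysis tools are released as open source
  - The best tool depends on the nature of the target data
- Data analysts (biologists) perform the data analysis
  - Tool developers (implementers) and users are not the same people
  - Users do not necessarily understand a tool's behavior completely
- Analysts are not analyzing data all the time
  - Many researchers do analysis on the side of their wet-lab experiments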
Analysis using implementations released as open source https://omictools.com/de-novo-genome-sequencing-category
Analysis using implementations released as open source https://omictools.com/whole-genome-resequencing-category
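What is happening in the life sciences today? Summary
- The volume and […] of data are increasing rapidly and will keep increasing
- Different tools and algorithms are used depending on the purpose
- Data analysts and tool developers (implementers) are often different people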
2. What kind of computers are needed today?
What kind of computers are being used today?
- PC
- PC cluster
- Supercomputers at shared research sites
  - National Institute of Genetics supercomputer system
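From "次世代シークエンサーDRY解析教本" (Next-Generation Sequencer DRY Analysis Textbook, a supplement to the journal 細胞工学). The textbook tells readers to buy a Mac.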
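What kind of computers are needed?
- When the target data grows in size or number, an ordinary PC struggles
  - Analysis data keeps piling up
    - Huge storage with fast reads and writes
  - Tools crash with "Out of memory"
    - Large memory
  - Batch jobs are run against large numbers of samples
    - A distributed job scheduling system
- Demand for large shared computers is rising
  - Introduction of the National Institute of Genetics supercomputer (2012~) => still not enough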
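The "same batch job over many samples" pattern can be sketched on a single machine with a worker pool. This is a stand-in, not a distributed job scheduler: on a shared cluster each sample would typically be submitted as a separate job instead. The analysis function `gc_content` is a hypothetical placeholder for a real per-sample tool, and the sample names are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(sequence: str) -> float:
    """Fraction of G/C bases; a placeholder for a real per-sample analysis."""
    if not sequence:
        return 0.0
    return sum(base in "GC" for base in sequence.upper()) / len(sequence)


def run_batch(samples: dict[str, str], max_workers: int = 4) -> dict[str, float]:
    """Apply the same analysis to every sample, several samples at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with the sample names.
        results = pool.map(gc_content, samples.values())
        return dict(zip(samples.keys(), results))


if __name__ == "__main__":
    print(run_batch({"sample1": "GGCC", "sample2": "ATAT", "sample3": "ATGC"}))
```

Because every sample is independent, the work scales out simply by adding workers or nodes, which is why a crowded job queue becomes the bottleneck rather than the algorithm.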
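Inter-University Research Institute Corporation, Research Organization of Information and Systems, National Institute of Genetics: SuperComputer Facilities of National Institute of Genetics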
photo from http://sc.ddbj.nig.ac.jp/index.php/ja-gallery
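A steadily growing number of users (from presentation materials by Mr. Ogasawara, DDBJ Center, National Institute of Genetics)
Disk space under pressure https://sc.ddbj.nig.ac.jp/index.php/ja-nig-statistics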
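What is the bottleneck in the field? (From interviews at supercomputer user meetings and similar occasions)
- Concerns of users who are not used to computers
  - They cannot tell what each machine can and cannot do
  - They need large-scale computers but cannot use a CUI
- Concerns of people who use computers fluently
  - The machines are crowded and jobs cannot be run
  - They cannot put enough budget into data analysis and storage
    - Setting up an environment is costly
    - They do not want to look after the machines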
It is commonly assumed that "wet-lab experiments cost money, but data analysis does not cost that much" http://trattoriainutano.tumblr.com/post/132214903857/
What kind of computers are needed? Summary
- Requirements differ with the target data and the purpose
  - In genomics, storage and memory are serious concerns
- Users' computer literacy varies widely
  - The layer a user wants depends on the user's level
3. Solving the problems with the cloud
- Problems the cloud can solve
  - Up-front cost
  - Crowded nodes
  - Maintenance cost
- Challenges in using the cloud
  - Cost of storage
  - Paying from research grants
  - Handling unpublished data and data that contains personal information
Cloud use case (SaaS): Google Genomics https://cloud.google.com/genomics/v1/analyze-variants
Cloud use case (IaaS): 1000 Genomes data on AWS https://aws.amazon.com/jp/1000genomes/
The NIH Commons: in the United States, the funding side is promoting cloud use.
"The Commons is a shared virtual space where scientists can work with the digital objects of biomedical research, i.e. it is a system that will allow investigators to find, manage, share, use and reuse data, software, metadata and workflows." - https://datascience.nih.gov/commons
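Cloud use case (PaaS/SaaS): a genome analysis pipeline on the academic inter-cloud
- JST CREST: research on application-centric overlay cloud technology using the inter-cloud (PI: Prof. Aida, NII)
- The academic inter-cloud effort
  - Link the National Institute of Genetics supercomputer with the NII cloud and other academic clouds
  - Make applications portable by packaging each analysis tool with Docker
  - Build workflows that combine the tools in advance and provide a GUI
  - Launch machines with the resources best suited to each analysis data set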
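Packaging each tool with Docker and giving each run its own resource allocation can be sketched as a small helper that composes a `docker run` command line. The image name `example/aligner:1.0`, the tool arguments, and the paths are hypothetical. Only standard `docker run` options (`--rm`, `-v`, `--cpus`, `--memory`) are used, and the sketch builds the command without executing it.

```python
import shlex


def docker_run_command(image, tool_args, data_dir, cpus=None, memory=None):
    """Build a `docker run` argument list for one containerized analysis step."""
    # Mount the host data directory at /data inside the container.
    cmd = ["docker", "run", "--rm", "-v", f"{data_dir}:/data"]
    if cpus is not None:
        cmd += ["--cpus", str(cpus)]  # CPU limit for this data set
    if memory is not None:
        cmd += ["--memory", memory]   # memory limit, e.g. "16g"
    cmd.append(image)
    cmd += list(tool_args)
    return cmd


if __name__ == "__main__":
    cmd = docker_run_command(
        "example/aligner:1.0",               # hypothetical image name
        ["align", "/data/reads.fastq"],      # hypothetical tool arguments
        "/home/user/sample1",
        cpus=4,
        memory="16g",
    )
    print(shlex.join(cmd))
```

Keeping the command as a list makes it safe to hand to `subprocess.run` later without shell quoting issues, and the same image can then run unchanged on a supercomputer node or a cloud instance.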
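Solving the problems with the cloud. Summary: challenges in using the cloud
- Cost of storage
  - Fast I/O is required during computation
  - Low-cost storage is wanted for archiving
- (For commercial clouds) paying from research grants
- Handling data that contains personal information
  - Establishing safety by accumulating a track record of use
  - Drawing up guidelines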
Genomic information at one university hospital after another: genome […] is being promoted by agencies such as AMED
Secure cloud computing for genomic data
Datta, Somalee, Keith Bettinger, and Michael Snyder. "Secure cloud computing for genomic data." Nature Biotechnology 34.6 (2016): 588-591.
The security required to use the cloud for genomic data analysis has to be achieved through cooperation between research institutions and cloud providers.
Security requirements (Datta, Bettinger, and Snyder, 2016)
- The data privacy agreement (agreement with the research institution on how data is handled)
- Physical and logical security
- Encryption of data (at rest and in transit)
- Authentication (user authentication)
- Principle of Least Privilege
- Firewalls
- Logging and monitoring
- Training (training on security certification)
- Security and privacy (protection of personal information)
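Handling of personal information and its relation to research use. From the Nihon Keizai Shimbun (Nikkei) article "医学研究と個人情報の両立を" (Reconciling medical research and personal information) http://www.nikkei.com/article/DGXKZO05121060S6A720C1EA1000/
Research data that contains personal information is extremely important for elucidating the causes of disease and for treatment. A secure environment would be a powerful asset for advancing research.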
Summary
- What is happening in the life sciences today?
  - Demand for computing is rising because large-scale data is accumulating
- What kind of computers are needed today?
  - In genomics, storage and memory matter most
  - Requirements differ in fine detail from user to user
- We want to solve these problems with the cloud
  - Keep improving the convenience of the cloud
  - Increasing the number of use cases is important
Reference materials
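About the Database Center for Life Science (DBCLS)
- Responsible for developing technology that contributes to database integration in the life sciences
- Core technology development
  - Works on technology for federated data integration using Semantic Web technology and natural language processing, and on drawing up international standards
- Collaboration with DDBJ
  - Develops technology for making use of data, starting with large-scale genome data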
About the Database Center for Life Science (DBCLS), continued
- Runs the database project jointly with NBDC, a center of JST
- Collaborates with DDBJ within the same organization (ROIS, which NII also belongs to) http://dbcls.rois.ac.jp/about