Tazro Inutano Ohta
July 11, 2014
Science
Public data repository and analysis pipeline for high-throughput sequencing
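NPO Yeast Cell Research Society (酵母細胞研究会), 186th regular meeting: research case studies using next-generation sequencers, and the public tools and databases that support them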
Transcript
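Research case studies using next-generation sequencers, and the public tools and databases that support them
Public data repository and analysis pipeline for high-throughput sequencing
Tazro Ohta <[email protected]>, Database Center for Life Science (DBCLS), Research Organization of Information and Systems (ROIS)
Prepared for the 186th regular meeting of the Yeast Cell Research Society (酵母細胞研究会), July 11, 2014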
Agenda
‣ About DBCLS and the Database Integration Project
‣ Databases related to NGS
‣ The role of public databases in an NGS research workflow
‣ From searching public data to analysis
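About DBCLS and the Database Integration Project (Database Integration Project and DBCLS)
‣ DBCLS belongs to the Inter-University Research Institute Corporation, Research Organization of Information and Systems (ROIS)
‣ It collaborates with NBDC, which belongs to JST, and with DDBJ at the National Institute of Genetics, which also belongs to ROIS
‣ NBDC is in charge of funding, DDBJ of data archiving, and DBCLS of technology development
The Database Center for Life Science (DBCLS) is a research institute that carries out research and development of technologies that promote data publication in the life sciences and support database construction.
http://dbcls.rois.ac.jp/about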
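DBCLS and the Integration Project: services developed and operated so far
‣ Integbio Database Catalog
‣ Life Science Database Archive
‣ Other services that support database integration, such as cross-database search
‣ Services that apply technologies researched and developed individually
‣ togogenome, GGRNA, RefEx, Newly Published Paper Reviews (新着論文レビュー), TogoTV (統合TV), InMeXes, etc.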
Database of Databases: Integbio DBcatalog http://integbio.jp/dbcatalog
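Entries can be filtered by species and category http://integbio.jp/dbcatalog
We take over the maintenance of databases http://dbarchive.biosciencedbc.jp/
Supports bulk download and the sorting out of terms of use http://dbarchive.biosciencedbc.jp/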
Find more at http://biosciencedbc.jp
‣ togogenome.org: genome information and visualization
‣ ggrna.dbcls.jp: fast nucleotide sequence search
‣ refex.dbcls.jp: gene expression reference
‣ first.lifesciencedb.jp: paper reviews in Japanese
‣ togotv.dbcls.jp: video tutorials
‣ docman.dbcls.jp/im: paper writing support
Find more at http://dbcls.rois.ac.jp/services
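Databases related to NGS (Data Repositories and Databases for high-throughput sequencing)
Databases and public data repositories related to NGS
‣ The international nucleotide sequence databases and the Sequence Read Archive
‣ Data hosting by large-scale projects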
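The international nucleotide sequence databases and the Sequence Read Archive
‣ INSDC: Int'l Nucleotide Sequence Database Collaboration
‣ Operated jointly by the responsible teams at three sites: NCBI, EBI, and DDBJ
‣ Sequence Read Archive: the primary data repository for NGS
www.insdc.org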
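Because the Sequence Read Archive is run jointly by NCBI, EBI, and DDBJ, an accession number usually reveals which archive issued a record and what kind of record it is. The sketch below illustrates that convention; it is not part of the talk. The function name `describe_accession` is hypothetical, and the pattern only covers the common three-letter SRA prefixes (for example `SRR`, `ERX`, `DRP`), so other INSDC identifiers such as BioProject or BioSample IDs are deliberately reported as unknown.

```python
import re

# First letter of an SRA accession: the archive that issued it.
ARCHIVES = {"S": "NCBI", "E": "EBI", "D": "DDBJ"}

# Third letter of an SRA accession: the type of record.
RECORD_TYPES = {
    "A": "submission",
    "P": "study",
    "S": "sample",
    "X": "experiment",
    "R": "run",
}

# Assumed shape: archive letter, literal "R", record-type letter, 6+ digits.
_ACCESSION = re.compile(r"^([SED])R([APSXR])(\d{6,})$")


def describe_accession(accession):
    """Return (archive, record_type) for an SRA accession, or None if unrecognized."""
    match = _ACCESSION.match(accession.strip().upper())
    if match is None:
        return None
    archive_letter, type_letter, _digits = match.groups()
    return ARCHIVES[archive_letter], RECORD_TYPES[type_letter]


if __name__ == "__main__":
    for acc in ["DRR000001", "SRP012345", "ERX000123", "GSE12345"]:
        print(acc, describe_accession(acc))
```

Records are mirrored between the three archives, so the prefix indicates where the data was submitted, not the only place it can be downloaded from.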
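Data hosting by large-scale projects
‣ Large projects sometimes publish their data themselves
‣ 1000 Genomes Project: http://1000genomes.org
‣ The Cancer Genome Atlas Project: http://tcga-data.nci.nih.gov
‣ ENCODE Project: http://genome.ucsc.edu/encode
‣ Copies of the data are sometimes also published on cloud services
‣ 1000 genomes on AWS: http://aws.amazon.com/1000genomes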
On the hierarchy of data and databases
‣ Levels of data: Knowledge, Summarised Data, Experimental Data
‣ Levels of resources: Knowledge-base, Database, Primary Data Repository
‣ All of these are commonly called a biological information "Database"
Databases related to NGS: Knowledge-base, Database, Primary Data Repository
The role of public databases in an NGS research workflow (The role of databases in each step of the sequencing research procedure)
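A typical NGS research workflow
Experimental design → preliminary experiments → sampling → DNA preparation → library preparation → sequencing → QC → filtering → alignment/assembly → QC → purpose-specific analysis → validation experiments → data publication → manuscript submission → revision/additional experiments → acceptance → celebratory drinks
Steps where public databases are involved
Experimental design → preliminary experiments → sampling → DNA preparation → library preparation → sequencing → QC → filtering → alignment/assembly → QC → purpose-specific analysis → validation experiments → data publication → manuscript submission → revision/additional experiments → acceptance → celebratory drinks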
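Designing NGS research that makes use of public data
‣ Experimental design before sequencing
  ‣ Test the post-sequencing workflow by analyzing similar data
‣ Data analysis after sequencing
  ‣ Assess the validity of the sequencing results
  ‣ Run comparative analyses against your own data
‣ After data analysis, when presenting results
  ‣ Publish the data in a repository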
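From searching public data to analysis (Search, Download, and Data Analysis of Public Sequencing Data)
From downloading public data to analysis
‣ Search with the repositories' own search functions
  ‣ Use the search at NCBI, EBI, or DDBJ
  ‣ Effective when the data ID is known in advance
‣ Search from related information such as publications or diseases
  ‣ Use DBCLS SRA
‣ Analyze with public analysis services
  ‣ DDBJ Read Annotation Pipeline
  ‣ Mudi: Mutation Discovery in yeast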
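As one illustration of using a repository's own search function, the sketch below builds a query URL for NCBI's E-utilities `esearch` endpoint against the `sra` database. This is an assumption added for illustration: the talk does not say which interface it demonstrates, and the helper name `build_sra_search_url` is hypothetical. The function only constructs the URL and makes no network request, so fetching and parsing the response is left to the reader.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sra_search_url(term, max_results=20):
    """Build an ESearch URL that queries the SRA database for `term`."""
    params = {
        "db": "sra",            # search the Sequence Read Archive
        "term": term,           # Entrez query, e.g. an accession or organism
        "retmax": max_results,  # maximum number of record IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    # Search by a known accession, the case the slide calls most effective.
    print(build_sra_search_url("DRP000001"))
    # Search by organism when no ID is known in advance.
    print(build_sra_search_url("Saccharomyces cerevisiae[Organism]", max_results=5))
```

A query by ID returns a small, precise result set, while a free-text or organism query can return very many records, which is the gap that literature-based and keyword search services aim to fill.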
How to use the repositories' search functions - github.com/inutano/sra_metadata_toolkit/wiki
Using DBCLS SRA - http://sra.dbcls.jp
Searching from publications - http://sra.dbcls.jp/cgi-bin/publication.cgi
Keyword full-text search - http://sra.dbcls.jp/search
Public NGS analysis pipeline: DDBJ Read Annotation Pipeline - http://p.ddbj.nig.ac.jp
How to use it is covered in the DDBJ training courses (materials and videos are also available) http://www.ddbj.nig.ac.jp/ddbjing/
Mudi: Mutation discovery in yeast - http://naoii.nig.ac.jp/mudi_top.html
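Summary
‣ DBCLS and the Database Integration Project maintain and integrate Japan's life science resources
‣ Making effective use of data published in public databases makes the research workflow more efficient
‣ Using public services for search and analysis lowers the cost of data analysis
Thank you for your attention!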
[email protected]
http://speakerdeck.com/inutano