Tazro Inutano Ohta
November 20, 2012
Sequence Read Archive: Database for High-throughput sequencing best practice 2012
"Next-generation sequencing analysis and public databases: making the most of the Sequence Read Archive"
Transcript
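Next-generation sequencing analysis and public databases: making the most of the Sequence Read Archive
Nov […] at CERI
Tazro Ohta

Today's contents
• About the public database SRA (Sequence Read Archive)
• SRA-related services provided by DBCLS
• Sequence Read Archive best practices
• Public databases and NGS: challenges and what comes next

Audience poll
• Next-generation sequencers?
• Sequence Read Archive?
• DBCLS SRA?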
About the public database SRA (Sequence Read Archive)
• In […], NCBI began collecting NGS data
• Since […], it has been operated by INSDC as the Sequence Read Archive
• INSDC: International Nucleotide Sequence Database Collaboration
• NCBI (United States), EBI (Europe), DDBJ (Japan)
• Each member provides submission, search, download, and other services
• Submitted data are exchanged, so they can be accessed from any of the three

The same data can be accessed from any member
[Figure: a submitted record (Data ID: 000001; organism: mouse; cell: nervous cell; sequencer: 454; date: 2011 12 08) with a FASTA sequence ">Seq_Numero_1 ATGCATGC...". Sequence data and metadata are submitted to one archive and exchanged across INSDC.]
www.ncbi.nlm.nih.gov/sra
www.ebi.ac.uk/ena
trace.ddbj.nig.ac.jp/dra
Try it out
• Look for human breast cancer data
http://www.everystockphoto.com/photo.php?imageId=3972069
http://www.flickr.com/photos/mindaugasdanys/3766009204/ Hard to use

Complaints
• The database structure is too complicated
• Data cannot be found

Mission
• Make public data easier to find and easier to use
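SRA-related services and tools provided by DBCLS

To make SRA data easier to use
• Search for data based on SRA statistics
• Search for data based on published papers
• Search for data by disease keyword
• View the sequence quality of submitted data
• Browse the metadata related to search results efficiently
• Use the APIs of the SRA-related services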
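Search for data based on SRA statistics
• Aggregates the metadata (annotation data) submitted together with the sequence data
• Ranks sequencer types, sample organisms, and experiment types
• Shows the growth in data submissions as a graph
Benefits
• Efficient narrowing of search results
• Shows at a glance what kinds of data exist and how much of each
• Visualizes trends in the field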
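The ranking described above comes down to counting metadata values. A minimal Python sketch, assuming each record is a dictionary with hypothetical field names (`platform`, `organism`, `library_strategy`) and made-up values; it is an illustration, not the sra.dbcls.jp implementation:

```python
from collections import Counter

# Made-up metadata records; the field names are assumptions for illustration.
records = [
    {"platform": "ILLUMINA", "organism": "Homo sapiens", "library_strategy": "RNA-Seq"},
    {"platform": "ILLUMINA", "organism": "Mus musculus", "library_strategy": "ChIP-Seq"},
    {"platform": "LS454", "organism": "Homo sapiens", "library_strategy": "WGS"},
]

def rank(records, field):
    """Return (value, count) pairs for one metadata field, most frequent first."""
    return Counter(r[field] for r in records if field in r).most_common()

print(rank(records, "platform"))
```

The same function can rank any field, so one pass per field is enough to build the sequencer, organism, and experiment-type tables.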
sra.dbcls.jp
“Search by statistics”
Graph of growth
Color-coded by experiment type
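Search for data based on published papers
• Many SRA entries have no link to a paper
• This is because much of the data is submitted before the paper comes out
• Mentions of public data are extracted from the literature and tied to their IDs
Benefits
• Search with sorting by journal, publication date, paper title, and so on
• Results can be narrowed by experiment type, organism, sequencer, and so on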
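Tying papers to data starts with spotting accession numbers in text. A minimal sketch, assuming the commonly used INSDC accession shape (first letter S, E, or D for NCBI, EBI, or DDBJ, then R, then a record-type letter and digits). The paper IDs and sentences are invented, and the actual service may work differently:

```python
import re

# Prefix: S/E/D (archive) + R + record-type letter, followed by 6 or more digits.
ACCESSION = re.compile(r"\b[SED]R[APXRSZ]\d{6,}\b")

def extract_accessions(text):
    """Return the unique accession-like strings in text, in order of appearance."""
    seen = []
    for match in ACCESSION.findall(text):
        if match not in seen:
            seen.append(match)
    return seen

def link_papers(papers):
    """Map each accession to the list of paper IDs that mention it."""
    index = {}
    for paper_id, text in papers.items():
        for acc in extract_accessions(text):
            index.setdefault(acc, []).append(paper_id)
    return index

papers = {  # invented examples
    "paper-1": "Reads were deposited in the SRA under SRP000001 and DRA000002.",
    "paper-2": "We reanalyzed SRP000001.",
}
print(link_papers(papers))
```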
sra.dbcls.jp
“Search by publication”
Filtered search; sorting by each field
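Search for data by disease keyword
• NGS is also widely used in medical research
• SRA is not organized by subject area, which makes searching difficult
• Data are organized by disease using the MeSH terms attached to the literature
• Searchable by the diseases with the most submitted data, or by disease category
Benefits
• High precision, because it is based on MeSH keywords
• Research trends can also be observed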
sra.dbcls.jp
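“Browse by disease”
“By frequency”
Submitted data; links to Gendoo, a database of disease-related genes
Check the boxes and click Search
Related MeSH terms, displayed according to relevance
“By disease category”
Search for data from the tree view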
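View the sequence quality of submitted data
• SRA data are not always of good quality
• Failed sequencing runs are also submitted and end up in the archive
• If several data sets share the same conditions, you want the most accurate one
• Quality is computed for all SRA data with FastQC
• http://www.bioinformatics.babraham.ac.uk/projects/fastqc
Benefits
• Avoids the tragedy of finding that data which took all night to download are broken
• Simply comparing sequence quality is fun in itself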
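As a rough illustration of what a per-read quality check measures (this is not FastQC itself), the sketch below computes mean Phred scores from FASTQ text, assuming the Phred+33 encoding. The two reads are made up:

```python
def mean_phred(quality_line, offset=33):
    """Mean Phred score of one FASTQ quality string (Phred+33 by default)."""
    scores = [ord(ch) - offset for ch in quality_line]
    return sum(scores) / len(scores)

def read_qualities(fastq_text):
    """Yield (read_id, mean quality) for each 4-line FASTQ record."""
    lines = fastq_text.strip().splitlines()
    for i in range(0, len(lines), 4):
        # Line 0 of each record is "@id", line 3 is the quality string.
        yield lines[i][1:], mean_phred(lines[i + 3])

fastq = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!II\n"
for read_id, q in read_qualities(fastq):
    print(read_id, q)
```

FastQC reports far more than this (per-base quality, adapter content, and so on), but a low mean score is already a useful warning sign.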
g86.dbcls.jp/sra
Enter an SRA ID
QC results from FastQC
Also accessible through the API
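Browse the metadata related to search results efficiently
• How do you pick out the data you want from a large number of search results?
• Compare conditions such as sample and sequencer
• Give priority to data with a published paper
• To compare conditions, you want the related information gathered in one place
Benefits
• No confusion among sample, sequence, project, and other IDs
• Seeing the paper information alongside allows a more accurate judgment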
g86.dbcls.jp/kusarinoko
Choose from All, human, mouse, Arabidopsis; enter a keyword and search
Paper information; list of matching data
Study (project) Experiment Run (Sequence) / QC
Sample
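The record types above map onto accession prefixes, which helps keep the IDs straight. A small sketch, assuming the usual INSDC convention (first letter for the archive, third letter for the record type). The accession in the example is invented:

```python
ARCHIVES = {"S": "NCBI", "E": "EBI", "D": "DDBJ"}
RECORD_TYPES = {"A": "Submission", "P": "Study", "X": "Experiment",
                "R": "Run", "S": "Sample"}

def describe(accession):
    """Return (archive, record type) for an accession such as 'DRX000001'."""
    acc = accession.upper()
    if len(acc) < 4 or acc[1] != "R" or not acc[3:].isdigit():
        raise ValueError(f"not an SRA-style accession: {accession}")
    try:
        return ARCHIVES[acc[0]], RECORD_TYPES[acc[2]]
    except KeyError:
        raise ValueError(f"unknown prefix: {accession}") from None

print(describe("DRX000001"))
```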
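Use the APIs of the SRA-related services
• You want to access the services repeatedly from a program
• You want to run the same search at regular intervals
• You want to look up information on a large amount of data
Benefits
• Everyday effort is greatly reduced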
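Repeating the same search on a schedule mostly means remembering what was seen last time. A minimal sketch: `search` is a stand-in for whatever client function calls a service (hypothetical here, since the slides do not give the API details), and only newly appearing IDs are reported. The IDs are invented:

```python
def new_hits(search, query, seen):
    """Run search(query) and return IDs not seen before, updating seen in place."""
    current = list(search(query))
    fresh = [hit for hit in current if hit not in seen]
    seen.update(fresh)
    return fresh

# Fake search function standing in for a real API call (an assumption).
responses = iter([["DRR000001"], ["DRR000001", "DRR000002"]])

def fake_search(query):
    return next(responses)

seen = set()
print(new_hits(fake_search, "human breast cancer", seen))  # first run
print(new_hits(fake_search, "human breast cancer", seen))  # a later run
```

In practice a scheduler such as cron would call `new_hits` at fixed intervals, with `seen` stored on disk between runs.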
g86.dbcls.jp/sra
Sequence Quality
SRA ID conversion; metadata retrieval
Only partly released
• Under active development to greatly reduce everyday effort
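Sequence Read Archive best practices

Steps for using public sequence data
• Search: search by the IDs or keywords given in papers
• Browse: go through the search results one by one to find the data you want
• Download: download the data via FTP, Aspera, etc.
• Check: verify the downloaded data
• Analyze: use the data in your analysis

Make each step more efficient and cut costs
Search → Browse → Download → Check → Analyze

Check in advance to lower the cost of downloading as well
Search → Browse and check → Download → Analyze
• Search: by statistics, papers, diseases, or keywords
• Browse and check: find data in the search results and confirm their quality too
• Download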
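The reordered workflow can be sketched as a filter that runs before any download. The records, the `mean_quality` field, and the threshold are all hypothetical; in practice the quality value would come from a precomputed QC report:

```python
def select_for_download(candidates, min_quality=30.0):
    """Keep only runs whose precomputed quality meets the threshold,
    best first, so bandwidth is not spent on poor data."""
    good = [c for c in candidates if c["mean_quality"] >= min_quality]
    return sorted(good, key=lambda c: c["mean_quality"], reverse=True)

candidates = [  # invented search results
    {"run": "DRR000001", "mean_quality": 35.2},
    {"run": "DRR000002", "mean_quality": 18.4},
    {"run": "DRR000003", "mean_quality": 31.0},
]
for c in select_for_download(candidates):
    print("download", c["run"])
```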
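Examples of secondary use of public NGS data
• Use data obtained under the same conditions as your own to increase N
• Use data related to your own for comparative analysis
• Different treatments, closely related species, Genome, Transcriptome, Epigenome, etc.
• Use the data to evaluate the performance of analysis tools
• Comparing multiple tools, or as demo data when developing a new tool
• Collect data in bulk and perform a meta-analysis
• Cross-cutting analyses across species or cell types, etc.

SRA has become much easier to use
• Now you can do more and more research with public data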
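http://www.flickr.com/photos/mindaugasdanys/3766009204/ Still hard to use

Complaints
• Search is […]
• The tools are scattered and hard to use together

http://www.flickr.com/photos/66986780@N00/137720685/ Improvements in progress

Please wait a little longer
• Development is ongoing, and feedback is welcome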
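Public databases and NGS: challenges and what comes next

Why make data public?
• Ensuring reproducibility
• Provide the means for reanalysis and demonstrate validity
• Promoting secondary use
• Share resources so that other researchers can use them too

Reality
• People release data because they are told to
• Journals ask for public database IDs at submission
• In some cases grant rules require all data to be made public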
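http://www.flickr.com/photos/74521133@N00/232362142/ We have to release it

No benefit for those who release data
• Releasing data is costly
• Organizing data neatly so that others can use it is hard work
• NGS data are large, so even uploading is a chore
• Respect for the people who released the data
• A third party analyzes data released before your paper and publishes first?
• At present, releasing data alone does not count as an academic achievement

What should the archive be?
• Lower the barrier to releasing data
• Standardize the methods for releasing data
• Minimum information standards, etc.
• Mechanisms for submitting data more easily
• Promote the use of data
• Organize public data so that it is easier to use
• Methods for making effective use of public data
• Give benefits to those who release data
• Assign DOIs to public data so that it can be cited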
What now?
• What should we do?

http://www.flickr.com/photos/mindaugasdanys/3766009204/ We will do our best

Thank you for your attention