NextGen Sequencing data intro.
David Rio Deiros
March 26, 2012
Research
Brief introduction to nextgen sequencing data.
Transcript
Next-Gen Sequencing Data
@shiondev
@drio
Σ Bases DNA == “The Genome” 3Gbp
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T - A
SEQUENCING
# of bases per day per machine
2000: 200 kbp
2003: 1 Mbp
2005: 200 Mbp
2009: 3 Gbp
2012: 60 Gbp
What can we do with NGS data?
Re-sequencing
Looking for changes in a genome (given that we have a HIGH quality reference)
Consequences?
Reliably finding those changes is not easy
(1%-3%) of your bases may be errors.
30x
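The three slides above can be tied together with a little arithmetic. The sketch below assumes "30x" means average coverage depth (the slide shows only the number) and reuses the 3 Gbp genome size and the 2012 figure of 60 Gbp per machine per day quoted earlier in the deck. The function names are hypothetical.

```python
GENOME_SIZE_BP = 3_000_000_000           # "3Gbp" from the genome slide
THROUGHPUT_BP_PER_DAY = 60_000_000_000   # "60Gbp 2012", per machine per day


def bases_needed(genome_size_bp: int, coverage: float) -> float:
    """Total sequenced bases required for an average depth of `coverage`."""
    return genome_size_bp * coverage


def machine_days(total_bases: float, throughput_bp_per_day: float) -> float:
    """Days of raw sequencing output on one machine (sequencing only)."""
    return total_bases / throughput_bp_per_day


def expected_errors_per_site(coverage: float, error_rate: float) -> float:
    """Expected number of wrong base calls among the reads covering one site."""
    return coverage * error_rate


if __name__ == "__main__":
    total = bases_needed(GENOME_SIZE_BP, 30)  # ASSUMPTION: 30x is coverage depth
    print(f"bases to sequence: {total / 1e9:.0f} Gbp")
    days = machine_days(total, THROUGHPUT_BP_PER_DAY)
    print(f"machine-days of raw output: {days:.1f}")
    for rate in (0.01, 0.03):  # the 1%-3% error range from the slide
        wrong = expected_errors_per_site(30, rate)
        print(f"error rate {rate:.0%}: ~{wrong:.1f} wrong calls per site")
```

Under these assumptions a 30x human genome needs 90 Gbp of sequence, about 1.5 machine-days of raw output at the 2012 rate. With 1%-3% errors, each site still sees on average fewer than one wrong call among its 30 reads, which is why redundancy helps separate real changes from noise.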
What’s the typical workflow in a re-sequencing project?
Library preparation
Sequencing
Analysis I (images -> reads)
.fastq
...
@HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
TGGACAACGGCCAGGTTAATGATGGGCAGGTAGAAGATGATCACT
+HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
___ccccccYc[eff`]X`a^ef][RHP^_cXIYSXcXcfSWXcd
...
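A FASTQ record is four lines: a header starting with `@`, the bases, a separator starting with `+`, and one quality character per base. The sketch below reads the record from the slide and decodes its qualities. The Phred+64 offset is an assumption (the characters look like the older Illumina encoding, but the deck does not say), and the function names are hypothetical.

```python
from typing import Iterable, Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield one record per group of four lines: @name, bases, +, qualities."""
    stripped = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if len(stripped) % 4 != 0:
        raise ValueError("expected a multiple of 4 non-empty lines")
    for i in range(0, len(stripped), 4):
        header, seq, plus, qual = stripped[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record at non-empty line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield FastqRecord(header[1:], seq, qual)


def phred_scores(quality: str, offset: int = 64) -> List[int]:
    """Decode quality characters. offset=64 is an ASSUMPTION about the encoding."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q: int) -> float:
    """Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


# The record shown on the slide, with the standard "@" header marker.
RECORD = """@HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
TGGACAACGGCCAGGTTAATGATGGGCAGGTAGAAGATGATCACT
+HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
___ccccccYc[eff`]X`a^ef][RHP^_cXIYSXcXcfSWXcd
"""

if __name__ == "__main__":
    for rec in parse_fastq(RECORD.splitlines()):
        scores = phred_scores(rec.quality)
        worst = min(scores)
        print(rec.name, len(rec.sequence), "bases")
        print("lowest quality:", worst, "P(error) =", error_probability(worst))
```

A quality of 20 corresponds to a 1% chance that the base call is wrong, which is the scale on which the deck's "1%-3%" error figure lives.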
Analysis II (alignments)
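To make "alignments" concrete, here is a toy sketch that places a short read on a reference by trying every ungapped offset and keeping the one with the fewest mismatches. This is only an illustration of the idea, not how production aligners work (they use indexes and handle gaps). The reference is the fragment shown earlier in the deck; the read and the function names are made up for the example.

```python
from typing import Tuple

REFERENCE = "ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG"  # fragment from the deck


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_alignment(read: str, reference: str) -> Tuple[int, int]:
    """Return (0-based offset, mismatches) of the best ungapped placement.

    Brute force: every offset is scored, ties go to the leftmost offset.
    """
    if not read or len(read) > len(reference):
        raise ValueError("read must be non-empty and no longer than the reference")
    n = len(read)
    offsets = range(len(reference) - n + 1)
    best = min(offsets, key=lambda pos: hamming(read, reference[pos:pos + n]))
    return best, hamming(read, reference[best:best + n])


if __name__ == "__main__":
    # HYPOTHETICAL read: reference[8:20] with one base changed (A -> T).
    read = "CAAGTGCCGGTT"
    offset, mismatches = best_alignment(read, REFERENCE)
    print(f"read placed at offset {offset} with {mismatches} mismatch(es)")
```

The single mismatch that survives alignment is exactly the kind of signal the next stage has to classify as either a real change or a sequencing error.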
Analysis III (variant calling)
.vcf
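As a rough illustration of what variant calling produces, the sketch below piles up ungapped reads over a reference, calls a single-base change where most reads agree on a non-reference base, and prints it with the eight fixed VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The thresholds, the `chrToy` name, the reads, and the function names are all invented for the example. Real callers use statistical models and base qualities rather than a simple majority rule.

```python
from collections import Counter
from typing import Dict, List, Tuple

REFERENCE = "ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG"  # fragment from the deck


def pileup(reference: str, alignments: List[Tuple[int, str]]) -> Dict[int, Counter]:
    """Count observed bases per 0-based reference position (ungapped reads)."""
    columns: Dict[int, Counter] = {}
    for offset, read in alignments:
        for i, base in enumerate(read):
            pos = offset + i
            if pos < len(reference):
                columns.setdefault(pos, Counter())[base] += 1
    return columns


def call_snvs(reference: str, alignments: List[Tuple[int, str]],
              min_depth: int = 3, min_fraction: float = 0.8,
              chrom: str = "chrToy") -> List[str]:
    """Return VCF-style data lines for positions where reads disagree with the reference."""
    records = []
    for pos, counts in sorted(pileup(reference, alignments).items()):
        depth = sum(counts.values())
        base, support = counts.most_common(1)[0]
        if depth >= min_depth and base != reference[pos] and support / depth >= min_fraction:
            # VCF positions are 1-based; "." marks fields this sketch leaves empty.
            fields = [chrom, str(pos + 1), ".", reference[pos], base, ".", ".", f"DP={depth}"]
            records.append("\t".join(fields))
    return records


# HYPOTHETICAL alignments: four reads carry T at position 12 (reference A);
# the last read also has one isolated error at position 15.
ALIGNMENTS = [
    (8, "CAAGTGCCGGTT"),
    (10, "AGTGCCGGTTTT"),
    (5, "TTTCAAGTGC"),
    (9, "AAGTGCAGGT"),
]

if __name__ == "__main__":
    print("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO")
    for line in call_snvs(REFERENCE, ALIGNMENTS):
        print(line)
```

The change supported by all four reads is reported, while the isolated mismatch at position 15 is outvoted by the other reads covering that site and is dropped.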
Annotation
Science starts here …
Library preparation, Sequencing, Analysis I (images -> reads), Analysis II (alignments), Analysis III (SNP calling), Annotation
5/15 Tb 150Gb 80G 16 8G 1G 1 400Mb 1
1 Genome (3-6 days) ~ 230Gb
Let’s do it again for N genomes
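Scaling the single-genome numbers to N genomes is simple multiplication. The sketch below assumes each genome costs what the earlier slide states (about 230 Gb of data, with the unit kept as written on the slide, and 3 to 6 days). The `parallel_pipelines` parameter and the function name are hypothetical additions for illustration.

```python
import math
from typing import Dict

PER_GENOME_STORAGE_GB = 230   # "~ 230Gb" from the slide, unit as written
PER_GENOME_DAYS = (3, 6)      # "(3-6 days)" from the slide


def project_cost(n_genomes: int, parallel_pipelines: int = 1) -> Dict[str, int]:
    """Estimate storage and wall-clock days for n_genomes.

    ASSUMPTION: genomes are processed in batches of `parallel_pipelines`,
    and every batch takes the same 3-6 days as a single genome.
    """
    if n_genomes < 0 or parallel_pipelines < 1:
        raise ValueError("n_genomes must be >= 0 and parallel_pipelines >= 1")
    batches = math.ceil(n_genomes / parallel_pipelines)
    return {
        "storage_gb": n_genomes * PER_GENOME_STORAGE_GB,
        "days_min": batches * PER_GENOME_DAYS[0],
        "days_max": batches * PER_GENOME_DAYS[1],
    }


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, project_cost(n, parallel_pipelines=10))
```

Storage grows linearly with N regardless of how many pipelines run in parallel, so at a thousand genomes data management becomes as much of a problem as the sequencing itself.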
Personalized medicine: tailor physician decisions and practices to individual patients
Let’s do it again for N genomes
Thanks!