$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
NextGen Sequencing data intro.
Search
David Rio Deiros
March 26, 2012
Research
3
230
NextGen Sequencing data intro.
Brief introduction to nextgen sequencing data.
David Rio Deiros
March 26, 2012
Tweet
Share
Other Decks in Research
See All in Research
論文読み会 SNLP2025 Learning Dynamics of LLM Finetuning. In: ICLR 2025
s_mizuki_nlp
0
350
Combining Deep Learning and Street View Imagery to Map Smallholder Crop Types
satai
3
270
教師あり学習と強化学習で作る 最強の数学特化LLM
analokmaus
2
730
AIグラフィックデザインの進化:断片から統合(One Piece)へ / From Fragment to One Piece: A Survey on AI-Driven Graphic Design
shunk031
0
570
単施設でできる臨床研究の考え方
shuntaros
0
3.3k
情報技術の社会実装に向けた応用と課題:ニュースメディアの事例から / appmech-jsce 2025
upura
0
280
[RSJ25] Enhancing VLA Performance in Understanding and Executing Free-form Instructions via Visual Prompt-based Paraphrasing
keio_smilab
PRO
0
180
POI: Proof of Identity
katsyoshi
0
120
湯村研究室の紹介2025 / yumulab2025
yumulab
0
240
SREのためのテレメトリー技術の探究 / Telemetry for SRE
yuukit
13
2.5k
空間音響処理における物理法則に基づく機械学習
skoyamalab
0
120
[IBIS 2025] 深層基盤モデルのための強化学習驚きから理論にもとづく納得へ
akifumi_wachi
15
8.1k
Featured
See All Featured
How to train your dragon (web standard)
notwaldorf
97
6.4k
Leading Effective Engineering Teams in the AI Era
addyosmani
8
1.3k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
32
2.7k
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
[RailsConf 2023] Rails as a piece of cake
palkan
58
6.2k
Measuring & Analyzing Core Web Vitals
bluesmoon
9
710
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
130k
Why Our Code Smells
bkeepers
PRO
340
57k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
17k
Rebuilding a faster, lazier Slack
samanthasiow
84
9.3k
Automating Front-end Workflow
addyosmani
1371
200k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
Transcript
Next-Gen Sequencing Data
@shiondev
@drio
None
Σ Bases DNA == “The Genome”
Σ Bases DNA == “The Genome” 3Gbp
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG…
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T -‐
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T -‐ A
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T -‐ A
SEQUENCING
None
# of Bases per day per machine
200kbp 2000
1Mbp 2003
200 Mbp 2005
3Gbp 2009
60Gbp 2012
what can we do with NGS data?
Re-sequencing
Re-sequencing Looking for changes in a Genome
Re-sequencing Looking for changes in a Genome (Given that we
have a HIGH quality reference)
Re-sequencing Looking for changes in a Genome (Given that we
have a HIGH quality reference) Consequences?
Reliably finding those changes is not easy
(1%-3%) of your bases may be errors.
30x
What’s the typical workflow in a re-sequencing project ?
Library preparation
Library preparation Sequencing
Library preparation Sequencing Analysis I (images -> reads)
.fastq ... >HWI-ST821_0129:5:1101:1927:2089#GATCAG/1 TGGACAACGGCCAGGTTAATGATGGGCAGGTAGAAGATGATCACT +HWI-ST821_0129:5:1101:1927:2089#GATCAG/1 ___ccccccYc[eff`]X`a^ef][RHP^_cXIYSXcXcfSWXcd ...
Library preparation Sequencing Analysis I (images -> reads) Analysis II
(alignments)
None
Library preparation Sequencing Analysis I (images -> reads) Analysis II
(alignments) Analysis III (Variant calling)
.vcf
Library preparation Sequencing Analysis I (images -> reads) Analysis II
(alignments) Analysis III (Variant calling) Annotation
Library preparation Sequencing Analysis I (images -> reads) Analysis II
(alignments) Analysis III (Variant calling) Annotation Science starts here …
None
Library preparation Sequencing Analysis I (images -> reads) Analysis II
(alignments) Analysis III (SNP calling) Annotation 5/15 Tb 150Gb 80G 16 8G 1G 1 400Mb 1
1 Genome (3-6 days) ~ 230Gb
1 Genome
Let’s do it again for N genomes
Let’s do it again for N genomes
None
None
None
None
None
None
None
personalize medicine
personalize medicine Tailor physician decisions and practices to individual patients
Let’s do it again for N genomes
None
None
Thanks!