NextGen Sequencing data intro.
David Rio Deiros
March 26, 2012
Brief introduction to nextgen sequencing data.
Transcript
Next-Gen Sequencing Data
@shiondev
@drio
Σ Bases DNA == “The Genome” (3 Gbp)
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T - A
SEQUENCING
# of bases per day per machine
2000: 200 kbp
2003: 1 Mbp
2005: 200 Mbp
2009: 3 Gbp
2012: 60 Gbp
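To put the growth in perspective, here is a small sketch that works out the fold increase and the implied doubling time from the figures on the slide. The function names and the idea of a constant doubling time are assumptions made for illustration; the talk only gives the five data points.

```python
import math

# Bases per day per machine, taken from the slide (year -> bases/day).
THROUGHPUT = {
    2000: 200_000,
    2003: 1_000_000,
    2005: 200_000_000,
    2009: 3_000_000_000,
    2012: 60_000_000_000,
}


def fold_increase(start_year, end_year):
    """How many times larger the daily throughput is in end_year."""
    return THROUGHPUT[end_year] / THROUGHPUT[start_year]


def doubling_time_months(start_year, end_year):
    """Average doubling time, assuming steady exponential growth (a simplification)."""
    doublings = math.log2(fold_increase(start_year, end_year))
    return 12 * (end_year - start_year) / doublings


def days_for_genome(year, genome_size=3_000_000_000, coverage=1):
    """Machine-days needed to produce `coverage` x genome_size bases."""
    return genome_size * coverage / THROUGHPUT[year]


print(f"2000 -> 2012: {fold_increase(2000, 2012):,.0f}x")
print(f"doubling time: {doubling_time_months(2000, 2012):.1f} months")
print(f"3 Gbp at 1x in 2012: {days_for_genome(2012):.2f} machine-days")
```

Under these assumptions the doubling time comes out at well under a year, which is the point the slide makes visually.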
What can we do with NGS data?
Re-sequencing
Looking for changes in a genome (given that we have a HIGH quality reference)
Consequences?
Reliably finding those changes is not easy
1%-3% of your bases may be errors.
30x
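A rough sketch of why depth helps, assuming "30x" means each position is covered by about 30 reads and that sequencing errors are independent between reads (both are assumptions, not statements from the slides). It computes the chance that at least k of the reads over one position carry an error there. Since it counts any error, not only errors that agree on the same wrong base, it is an upper bound on the chance of k reads supporting the same false variant.

```python
from math import comb


def prob_at_least_k_errors(depth, error_rate, k):
    """P(X >= k) for X ~ Binomial(depth, error_rate)."""
    return sum(
        comb(depth, i) * error_rate**i * (1 - error_rate) ** (depth - i)
        for i in range(k, depth + 1)
    )


DEPTH = 30  # assumed meaning of "30x"
for rate in (0.01, 0.03):
    for k in (1, 3, 10):
        p = prob_at_least_k_errors(DEPTH, rate, k)
        print(f"error rate {rate:.0%}, >= {k:2d} erroneous reads: {p:.3g}")
```

A single erroneous read at a position is common at these error rates, while many erroneous reads piling up at the same position is rare, which is why a variant caller can ask for several supporting reads before reporting a change.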
What’s the typical workflow in a re-sequencing project?
Library preparation
Sequencing
Analysis I (images -> reads)
.fastq
...
@HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
TGGACAACGGCCAGGTTAATGATGGGCAGGTAGAAGATGATCACT
+HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
___ccccccYc[eff`]X`a^ef][RHP^_cXIYSXcXcfSWXcd
...
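A minimal sketch of reading such a record, assuming the usual four-line FASTQ layout (header starting with "@", bases, a "+" separator line, one quality character per base). The quality offset of 64 is an assumption based on the characters in the example, which match the older Illumina Phred+64 encoding; current files normally use offset 33. The helper names are made up for this illustration.

```python
# The record from the slide, written with the "@" header marker FASTQ uses.
EXAMPLE = """
@HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
TGGACAACGGCCAGGTTAATGATGGGCAGGTAGAAGATGATCACT
+HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
___ccccccYc[eff`]X`a^ef][RHP^_cXIYSXcXcfSWXcd
"""


def parse_fastq(text):
    """Yield (name, sequence, quality_string) for each four-line record."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield header[1:], seq, qual


def phred_scores(qual, offset=64):
    """Decode quality characters to Phred scores (offset 64 is assumed here)."""
    return [ord(ch) - offset for ch in qual]


def error_probability(q):
    """Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


for name, seq, qual in parse_fastq(EXAMPLE):
    scores = phred_scores(qual)
    worst = min(scores)
    print(name, len(seq), "bases")
    print("lowest quality:", worst, "-> P(error) ~", round(error_probability(worst), 3))
```

The per-base qualities are what later steps use to decide how much to trust each base.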
Analysis II (alignments)
Analysis III (variant calling)
.vcf
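The slide only names the file extension. As a rough sketch of what a variant-call record holds, the snippet below splits one data line into the eight fixed, tab-separated VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The example record and the helper names are hypothetical and do not come from the talk.

```python
from collections import namedtuple

VcfRecord = namedtuple("VcfRecord", "chrom pos id ref alt qual filter info")

# Hypothetical record for illustration only (not data from the talk).
EXAMPLE_VCF = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=30\n"
)


def parse_vcf_line(line):
    """Parse the eight fixed columns of one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated columns")
    chrom, pos, id_, ref, alt, qual, filt, info = fields[:8]
    return VcfRecord(
        chrom=chrom,
        pos=int(pos),                       # 1-based position on the reference
        id=id_,
        ref=ref,
        alt=alt.split(","),                 # several alternate alleles are comma-separated
        qual=None if qual == "." else float(qual),
        filter=filt,
        info=info,
    )


def parse_vcf(text):
    """Yield records, skipping meta-information and header lines (they start with '#')."""
    for line in text.splitlines():
        if line and not line.startswith("#"):
            yield parse_vcf_line(line)


for rec in parse_vcf(EXAMPLE_VCF):
    print(f"{rec.chrom}:{rec.pos} {rec.ref}>{','.join(rec.alt)} qual={rec.qual}")
```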
Annotation
Science starts here …
Library preparation, Sequencing, Analysis I (images -> reads), Analysis II (alignments), Analysis III (SNP calling), Annotation
[Figures shown alongside these steps on the slide: 5/15 Tb, 150Gb, 80G, 16, 8G, 1G, 1, 400Mb, 1]
1 Genome (3-6 days) ~ 230Gb
1 Genome
Let’s do it again for N genomes
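A back-of-the-envelope sketch of what "N genomes" means, using the per-genome figures from the slide above (about 230 Gb and 3-6 days each). Running the genomes one after another on a fixed number of pipelines is an assumption made for illustration, as are the function and parameter names; the unit is kept as "Gb", as written on the slide.

```python
import math


def project_cohort(n_genomes, gb_per_genome=230, days_per_genome=(3, 6), parallel=1):
    """Estimate total storage and wall-clock days for n_genomes.

    `parallel` is the hypothetical number of genomes processed at the same time.
    """
    batches = math.ceil(n_genomes / parallel)
    low, high = days_per_genome
    return {
        "storage_gb": n_genomes * gb_per_genome,
        "days_min": batches * low,
        "days_max": batches * high,
    }


for n in (1, 10, 100, 1000):
    est = project_cohort(n)
    print(
        f"{n:5d} genomes: ~{est['storage_gb'] / 1000:.1f} Tb, "
        f"{est['days_min']}-{est['days_max']} days on one pipeline"
    )
```

Storage grows linearly with N, while elapsed time can be reduced only by adding parallel capacity.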
Personalized medicine: tailor physician decisions and practices to individual patients
Let’s do it again for N genomes
Thanks!