NextGen Sequencing data intro.
David Rio Deiros
March 26, 2012
Research
Brief introduction to nextgen sequencing data.
Transcript
Next-Gen Sequencing Data
@shiondev
@drio
Σ Bases DNA == “The Genome” (3 Gbp)
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG…
Changes highlighted along the sequence: G, T, -, A
SEQUENCING
# of bases per day per machine
Year    Bases per day
2000    200 kbp
2003    1 Mbp
2005    200 Mbp
2009    3 Gbp
2012    60 Gbp
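To put the throughput figures in perspective, here is a minimal Python sketch that computes the fold increase relative to 2000 and how many days one machine would need to read 3 Gbp once (1x). The numbers are copied from the slide; the function names are illustrative and not part of the talk.

```python
# Throughput figures copied from the slide: year -> bases per day per machine.
THROUGHPUT_BP_PER_DAY = {
    2000: 200_000,          # 200 kbp
    2003: 1_000_000,        # 1 Mbp
    2005: 200_000_000,      # 200 Mbp
    2009: 3_000_000_000,    # 3 Gbp
    2012: 60_000_000_000,   # 60 Gbp
}

# Genome size from the earlier slide (3 Gbp).
GENOME_SIZE_BP = 3_000_000_000


def fold_increase(year, baseline_year=2000):
    """How many times faster a machine was in `year` than in `baseline_year`."""
    return THROUGHPUT_BP_PER_DAY[year] / THROUGHPUT_BP_PER_DAY[baseline_year]


def days_for_one_pass(year, genome_size_bp=GENOME_SIZE_BP):
    """Days one machine needs to produce `genome_size_bp` bases (1x, no overhead)."""
    return genome_size_bp / THROUGHPUT_BP_PER_DAY[year]


for year in sorted(THROUGHPUT_BP_PER_DAY):
    print(year, f"{fold_increase(year):,.0f}x", f"{days_for_one_pass(year):,.2f} days")
```

This ignores library preparation, run setup, and any coverage above 1x, so it only gives a lower bound on wall-clock time.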
What can we do with NGS data?
Re-sequencing: looking for changes in a genome (given that we have a HIGH quality reference). Consequences?
Reliably finding those changes is not easy.
1%-3% of your bases may be errors.
30x coverage
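A rough way to see why reading each position about 30 times helps with a 1%-3% per-base error rate is a binomial estimate. The sketch below assumes errors are independent between reads and counts any error at a position, both simplifications made for illustration and not claims from the talk.

```python
from math import comb


def prob_at_least_k_errors(n_reads, k, error_rate):
    """P(at least k of n_reads are wrong at one position), binomial model.

    Assumes each read is wrong independently with probability error_rate.
    """
    return sum(
        comb(n_reads, i) * error_rate**i * (1 - error_rate) ** (n_reads - i)
        for i in range(k, n_reads + 1)
    )


for error_rate in (0.01, 0.03):
    # With a single read, the call is wrong whenever that read is wrong.
    single = prob_at_least_k_errors(1, 1, error_rate)
    # With 30 reads, a simple majority vote fails only if at least 15 are wrong.
    majority = prob_at_least_k_errors(30, 15, error_rate)
    print(f"error rate {error_rate:.0%}: 1x -> {single:.2e}, 30x majority -> {majority:.2e}")
```

Real variant callers are more involved than a majority vote (they use base qualities, mapping qualities, and diploid genotype models), so treat this only as intuition for why depth matters.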
What’s the typical workflow in a re-sequencing project?
Library preparation
Sequencing
Analysis I (images -> reads)
.fastq
...
@HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
TGGACAACGGCCAGGTTAATGATGGGCAGGTAGAAGATGATCACT
+HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
___ccccccYc[eff`]X`a^ef][RHP^_cXIYSXcXcfSWXcd
...
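A FASTQ record is four lines: a header, the read sequence, a `+` separator line, and one quality character per base. The sketch below parses the record from the slide, with the header written in the standard `@` form. The quality offset of 64 is an assumption: the characters in this record (for example `c`, `e`, `f`) are consistent with the older Illumina Phred+64 encoding, but the slide does not state the encoding.

```python
# The record from the slide; the header uses the standard FASTQ '@' prefix.
RECORD = """\
@HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
TGGACAACGGCCAGGTTAATGATGGGCAGGTAGAAGATGATCACT
+HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
___ccccccYc[eff`]X`a^ef][RHP^_cXIYSXcXcfSWXcd
"""

# Assumption: Phred+64 quality encoding (older Illumina pipelines).
PHRED_OFFSET = 64


def parse_fastq(text, offset=PHRED_OFFSET):
    """Parse simple four-line FASTQ text into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ records must have four lines each")
    records = []
    for i in range(0, len(lines), 4):
        header, sequence, separator, quality = lines[i:i + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        records.append({
            "name": header[1:],
            "sequence": sequence,
            # One integer quality score per base.
            "qualities": [ord(ch) - offset for ch in quality],
        })
    return records


read = parse_fastq(RECORD)[0]
print(read["name"], len(read["sequence"]), "bases, min quality", min(read["qualities"]))
```

For real data a maintained parser is a safer choice than this hand-rolled one, since FASTQ files in the wild can be compressed and very large.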
Analysis II (alignments)
Analysis III (Variant calling)
.vcf
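The slide only names the format, so the following is a minimal sketch of what a VCF data line looks like and how its eight fixed, tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) can be split. The example record is hypothetical and invented for illustration; it does not come from the talk.

```python
# The eight fixed columns of a VCF data line, in order.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]

# Hypothetical record, made up for illustration only.
EXAMPLE_LINE = "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=30"


def parse_vcf_line(line):
    """Split one VCF data line into a dict keyed by the fixed column names."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(VCF_COLUMNS):
        raise ValueError("a VCF data line needs at least 8 tab-separated fields")
    record = dict(zip(VCF_COLUMNS, fields))
    record["POS"] = int(record["POS"])      # 1-based position on the reference
    record["ALT"] = record["ALT"].split(",")  # several alternate alleles are allowed
    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info = {}
    for item in record["INFO"].split(";"):
        key, _, value = item.partition("=")
        info[key] = value if value else True
    record["INFO"] = info
    return record


variant = parse_vcf_line(EXAMPLE_LINE)
print(variant["CHROM"], variant["POS"], variant["REF"], "->", variant["ALT"])
```

Header lines (those starting with `#`) and per-sample genotype columns are left out here to keep the sketch short.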
Annotation
Science starts here …
Library preparation, Sequencing, Analysis I (images -> reads), Analysis II (alignments), Analysis III (SNP calling), Annotation
Data sizes shown on the slide: 5/15 Tb, 150 Gb, 80 G, 16, 8 G, 1 G, 1, 400 Mb, 1
1 Genome (3-6 days) ~ 230Gb
Let’s do it again for N genomes
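Scaling the single-genome figures (3-6 days, ~230 Gb) to N genomes is simple arithmetic, sketched below. It assumes the per-genome numbers stay constant and that genomes are processed one after another on a single pipeline; both are assumptions for illustration, and the unit label "Gb" is kept exactly as written on the slide.

```python
# Per-genome figures from the slide.
STORAGE_PER_GENOME_GB = 230      # "~ 230Gb", unit label as on the slide
DAYS_PER_GENOME = (3, 6)         # "3-6 days"


def storage_for(n_genomes):
    """Total storage for n_genomes, in the slide's 'Gb' unit."""
    return n_genomes * STORAGE_PER_GENOME_GB


def days_for(n_genomes):
    """(best, worst) total days, assuming genomes run one after another."""
    low, high = DAYS_PER_GENOME
    return n_genomes * low, n_genomes * high


for n in (1, 10, 100, 1000):
    low, high = days_for(n)
    # Assumes 1 Tb = 1000 Gb for display purposes.
    print(f"{n:>5} genomes: {storage_for(n) / 1000:,.2f} Tb, {low}-{high} days")
```

Running machines in parallel would shorten the elapsed time but not the total storage.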
Personalized medicine: tailor physician decisions and practices to individual patients
Thanks!