NextGen Sequencing data intro.
David Rio Deiros
March 26, 2012
Research
Brief introduction to nextgen sequencing data.
Transcript
Next-Gen Sequencing Data
@shiondev
@drio
Σ Bases DNA == “The Genome” 3Gbp
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T - A
SEQUENCING
# of bases per day per machine
2000: 200 kbp
2003: 1 Mbp
2005: 200 Mbp
2009: 3 Gbp
2012: 60 Gbp
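The growth in these figures is easier to appreciate as ratios. The following is a minimal sketch that only does arithmetic on the numbers from the slide; the function names and the idea of expressing throughput as "machine-days per 3 Gbp" are illustrative assumptions, not something the deck states.

```python
# Throughput figures from the slide: year -> bases per day per machine.
throughput_bp_per_day = {
    2000: 200e3,  # 200 kbp
    2003: 1e6,    # 1 Mbp
    2005: 200e6,  # 200 Mbp
    2009: 3e9,    # 3 Gbp
    2012: 60e9,   # 60 Gbp
}

GENOME_SIZE_BP = 3e9  # "The Genome" is given as 3 Gbp in the deck


def fold_increase(start_year, end_year):
    """Ratio of daily throughput between two years on the slide."""
    return throughput_bp_per_day[end_year] / throughput_bp_per_day[start_year]


def days_for_one_pass(year):
    """Machine-days needed to produce 3 Gbp of raw bases (a single 1x pass).

    Illustrative only: real projects need many-fold coverage on top of this.
    """
    return GENOME_SIZE_BP / throughput_bp_per_day[year]


if __name__ == "__main__":
    for year in sorted(throughput_bp_per_day):
        print(year, f"{days_for_one_pass(year):,.2f} machine-days per 3 Gbp")
    print(f"2000 -> 2012: {fold_increase(2000, 2012):,.0f}x more bases per day")
```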
What can we do with NGS data?
Re-sequencing: looking for changes in a genome (given that we have a HIGH quality reference). Consequences?
Reliably finding those changes is not easy
1%-3% of your bases may be errors.
30x
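One way to see why depth matters when 1%-3% of bases may be wrong: if "30x" is read as 30 reads covering each position (an assumption about the slide), the chance that many of those reads carry an error at the same position drops quickly. The sketch below uses a plain binomial model with independent errors, which is a simplification and not the method of any particular variant caller; the function name is hypothetical.

```python
from math import comb


def prob_at_least(k, depth, error_rate):
    """P(at least k of `depth` reads have an error at one position).

    Assumes errors are independent and equally likely in every read,
    i.e. the count of erroneous reads is Binomial(depth, error_rate).
    """
    return sum(
        comb(depth, i) * error_rate**i * (1 - error_rate) ** (depth - i)
        for i in range(k, depth + 1)
    )


if __name__ == "__main__":
    depth = 30  # assumed meaning of "30x"
    for error_rate in (0.01, 0.03):  # error range quoted on the slide
        for k in (1, 3, 15):
            p = prob_at_least(k, depth, error_rate)
            print(f"error={error_rate:.0%} depth={depth} P(>= {k} bad reads) = {p:.3g}")
```

Under this model a single erroneous read at a position is common, while half of the reads being wrong at once is extremely unlikely, which is the intuition behind sequencing each position many times.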
What’s the typical workflow in a re-sequencing project?
Library preparation
Sequencing
Analysis I (images -> reads)
.fastq
...
@HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
TGGACAACGGCCAGGTTAATGATGGGCAGGTAGAAGATGATCACT
+HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
___ccccccYc[eff`]X`a^ef][RHP^_cXIYSXcXcfSWXcd
...
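A FASTQ record is four lines: a header, the bases, a separator, and one quality character per base. The sketch below parses the record from the slide, written with the "@" header marker that FASTQ conventionally uses. The quality offset of 64 is an assumption based on the characters that appear in this record (older Illumina pipelines used it; many current files use 33), and the function name is hypothetical.

```python
RECORD = """
@HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
TGGACAACGGCCAGGTTAATGATGGGCAGGTAGAAGATGATCACT
+HWI-ST821_0129:5:1101:1927:2089#GATCAG/1
___ccccccYc[eff`]X`a^ef][RHP^_cXIYSXcXcfSWXcd
"""


def parse_fastq_record(text, phred_offset=64):
    """Split one four-line FASTQ record and decode its quality string.

    phred_offset=64 is an assumption for this record; pass 33 for
    files that use the Sanger / newer Illumina encoding.
    """
    header, sequence, separator, quality = text.strip().splitlines()
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    scores = [ord(char) - phred_offset for char in quality]
    return {"name": header[1:], "sequence": sequence, "scores": scores}


def error_probability(score):
    """Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-score / 10)


if __name__ == "__main__":
    read = parse_fastq_record(RECORD)
    worst = min(read["scores"])
    print(read["name"], len(read["sequence"]), "bases")
    print("lowest quality:", worst, "-> P(error) =", round(error_probability(worst), 3))
```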
Analysis II (alignments)
Analysis III (Variant calling)
.vcf
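The slide only names the format, so here is a minimal sketch of what a VCF data line holds. The eight fixed tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) come from the VCF format itself; the example line is made up for illustration and is not data from the talk, and the function name is hypothetical.

```python
# Hypothetical VCF data line (invented for illustration, not from the deck).
EXAMPLE_LINE = "chr1\t12345\t.\tG\tT\t50\tPASS\tDP=30"

FIXED_COLUMNS = ["chrom", "pos", "id", "ref", "alt", "qual", "filter", "info"]


def parse_vcf_line(line):
    """Parse the eight fixed columns of one VCF data line into a dict."""
    if line.startswith("#"):
        raise ValueError("header line, not a variant record")
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(FIXED_COLUMNS, fields[:8]))
    record["pos"] = int(record["pos"])       # 1-based position on the reference
    record["alt"] = record["alt"].split(",")  # several alternate alleles are allowed
    # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
    info = {}
    for item in record["info"].split(";"):
        key, _, value = item.partition("=")
        info[key] = value if value else True
    record["info"] = info
    return record


if __name__ == "__main__":
    variant = parse_vcf_line(EXAMPLE_LINE)
    print(variant["chrom"], variant["pos"], variant["ref"], "->", variant["alt"])
```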
Annotation
Science starts here …
Library preparation, Sequencing, Analysis I (images -> reads), Analysis II (alignments), Analysis III (SNP calling), Annotation
Per-step figures from the slide: 5/15 Tb 150Gb 80G 16 8G 1G 1 400Mb 1
1 Genome (3-6 days) ~ 230Gb
Let’s do it again for N genomes
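Scaling the single-genome figures (3-6 days, ~230Gb) to N genomes is simple multiplication, sketched below. The unit is kept as the slide writes it ("Gb"). The `parallel_runs` parameter and the assumption that each genome takes the full 3-6 days with no overlap between batches are illustrative and not from the deck.

```python
from math import ceil

GB_PER_GENOME = 230            # "~ 230Gb" per genome, unit as written on the slide
DAYS_PER_GENOME = (3, 6)       # "3-6 days" per genome


def project_estimate(n_genomes, parallel_runs=1):
    """Rough storage and wall-clock estimate for n_genomes.

    Assumes genomes are processed in batches of `parallel_runs`
    (a hypothetical knob) and that batches do not overlap in time.
    """
    batches = ceil(n_genomes / parallel_runs)
    low, high = DAYS_PER_GENOME
    return {
        "storage_gb": n_genomes * GB_PER_GENOME,
        "days": (low * batches, high * batches),
    }


if __name__ == "__main__":
    for n in (1, 10, 100):
        estimate = project_estimate(n, parallel_runs=10)
        print(n, "genomes:", estimate["storage_gb"], "Gb,",
              estimate["days"][0], "to", estimate["days"][1], "days")
```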
Personalized medicine: tailor physician decisions and practices to individual patients
Thanks!