Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
NextGen Sequencing data intro.
Search
David Rio Deiros
March 26, 2012
Research
3
230
NextGen Sequencing data intro.
Brief introduction to nextgen sequencing data.
David Rio Deiros
March 26, 2012
Tweet
Share
Other Decks in Research
See All in Research
最適決定木を用いた処方的価格最適化
mickey_kubo
4
1.7k
EarthMarker: A Visual Prompting Multimodal Large Language Model for Remote Sensing
satai
3
330
Towards a More Efficient Reasoning LLM: AIMO2 Solution Summary and Introduction to Fast-Math Models
analokmaus
2
210
Principled AI ~深層学習時代における課題解決の方法論~
taniai
3
1.2k
データサイエンティストの採用に関するアンケート
datascientistsociety
PRO
0
960
最適化と機械学習による問題解決
mickey_kubo
0
140
Trust No Bot? Forging Confidence in AI for Software Engineering
tomzimmermann
1
240
利用シーンを意識した推薦システム〜SpotifyとAmazonの事例から〜
kuri8ive
1
200
When Submarine Cables Go Dark: Examining the Web Services Resilience Amid Global Internet Disruptions
irvin
0
190
GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization
satai
3
230
90 分で学ぶ P 対 NP 問題
e869120
17
7.4k
NLP2025 WS Shared Task 文法誤り訂正部門 ehiMetrick
sugiyamaseiji
0
190
Featured
See All Featured
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
43
2.4k
Build The Right Thing And Hit Your Dates
maggiecrowley
36
2.8k
How STYLIGHT went responsive
nonsquared
100
5.6k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
53
2.8k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
700
Balancing Empowerment & Direction
lara
1
340
Why You Should Never Use an ORM
jnunemaker
PRO
56
9.4k
Measuring & Analyzing Core Web Vitals
bluesmoon
7
480
Statistics for Hackers
jakevdp
799
220k
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
48
5.4k
RailsConf 2023
tenderlove
30
1.1k
Mobile First: as difficult as doing things right
swwweet
223
9.7k
Transcript
Next-Gen Sequencing Data
@shiondev
@drio
None
Σ Bases DNA == “The Genome”
Σ Bases DNA == “The Genome” 3Gbp
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG…
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T -‐
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T -‐ A
…ACAGTTTTCAAGAGCCGGTTTTACTAGGATTATTACTG… G T -‐ A
SEQUENCING
None
# of Bases per day per machine
200kbp 2000
1Mbp 2003
200 Mbp 2005
3Gbp 2009
60Gbp 2012
what can we do with NGS data?
Re-sequencing
Re-sequencing Looking for changes in a Genome
Re-sequencing Looking for changes in a Genome (Given that we
have a HIGH quality reference)
Re-sequencing Looking for changes in a Genome (Given that we
have a HIGH quality reference) Consequences?
Reliably finding those changes is not easy
(1%-3%) of your bases may be errors.
30x
What’s the typical workflow in a re-sequencing project ?
Library preparation
Library preparation Sequencing
Library preparation Sequencing Analysis I (images -> reads)
.fastq ... >HWI-ST821_0129:5:1101:1927:2089#GATCAG/1 TGGACAACGGCCAGGTTAATGATGGGCAGGTAGAAGATGATCACT +HWI-ST821_0129:5:1101:1927:2089#GATCAG/1 ___ccccccYc[eff`]X`a^ef][RHP^_cXIYSXcXcfSWXcd ...
Library preparation Sequencing Analysis I (images -> reads) Analysis II
(alignments)
None
Library preparation Sequencing Analysis I (images -> reads) Analysis II
(alignments) Analysis III (Variant calling)
.vcf
Library preparation Sequencing Analysis I (images -> reads) Analysis II
(alignments) Analysis III (Variant calling) Annotation
Library preparation Sequencing Analysis I (images -> reads) Analysis II
(alignments) Analysis III (Variant calling) Annotation Science starts here …
None
Library preparation Sequencing Analysis I (images -> reads) Analysis II
(alignments) Analysis III (SNP calling) Annotation 5/15 Tb 150Gb 80G 16 8G 1G 1 400Mb 1
1 Genome (3-6 days) ~ 230Gb
1 Genome
Let’s do it again for N genomes
Let’s do it again for N genomes
None
None
None
None
None
None
None
personalize medicine
personalize medicine Tailor physician decisions and practices to individual patients
Let’s do it again for N genomes
None
None
Thanks!