Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Hunting for viruses in French Guiana
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Nacho Caballero
April 29, 2014
Science
63
0
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
Hunting for viruses in French Guiana
Lab meeting presentation about my work doing viral metagenomic analysis in French Guiana
Nacho Caballero
April 29, 2014
More Decks by Nacho Caballero
See All by Nacho Caballero
Bridging data analysis and interactive visualization
nachocab
0
47
Other Decks in Science
See All in Science
Non-Gaussian, nonlinear causal discovery with hidden variables and application
sshimizu2006
0
140
機械学習 - 授業概要
trycycle
PRO
0
540
Inside the Mind of an LLM
baggiponte
0
190
ITTF卓球世界ランキングのポイント比を用いた試合結果予測モデルの性能評価 / Performance evaluation of match result prediction models using the point ratio of the ITTF Table Tennis World Ranking
konakalab
0
130
Conversation is the New Dashboard: 属人性を排除する第4世代BIツールの勢力図
shomaekawa
1
600
Bear-safety-running
akirun_run
0
160
【論文紹介】Is CLIP ideal? No. Can we fix it?Yes! 第65回 コンピュータビジョン勉強会@関東
shun6211
5
2.5k
ハミルトン・ヤコビ方程式の解の性質と物理的意味
enakai00
0
690
ダメな自分の育て方―性格タイプの「劣等機能」から理解するニガテ克服術
ppillc
0
170
医療 LLM ベンチマークの現在地:多面的評価 と日本ローカライズ
analokmaus
1
530
AkarengaLT vol.40
hashimoto_kei
0
110
Physical AIを支えるWeights & Biases
olachinkei
1
390
Featured
See All Featured
Building AI with AI
inesmontani
PRO
1
1.1k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
250
1.3M
How to Think Like a Performance Engineer
csswizardry
28
2.7k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
25
2k
Art, The Web, and Tiny UX
lynnandtonic
304
22k
Building a A Zero-Code AI SEO Workflow
portentint
PRO
0
610
How to Talk to Developers About Accessibility
jct
2
250
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.7k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
128
56k
Google's AI Overviews - The New Search
badams
0
1k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3.5k
KATA
mclloyd
PRO
35
15k
Transcript
French Guiana Virus Hunting in Nacho Caballero
French Guiana
Rodents Bats
Rodents Bats Leishmania
Capture
Capture Isolate viral particles
Capture Isolate viral particles Extract RNA
Capture Isolate viral particles Extract RNA Sequence
Estimated read coverage % reads with coverage smaller than x
Rodents
Estimated read coverage % reads with coverage smaller than x
Rodents
Estimated read coverage % reads with coverage smaller than x
Rodents Bats
Read How can we estimate the coverage without a reference
genome?
Read How can we estimate the coverage without a reference
genome?
K-mers Read How can we estimate the coverage without a
reference genome?
How can we estimate the coverage without a reference genome?
1 1 1 1 1 1 1 How can we
estimate the coverage without a reference genome?
7 8 10 8 11 3 6
7 8 10 8 11 3 6 Median k-mer count
≈ Read coverage
None
k-mers make it possible to align without a reference
None
Problem: each sequencing error introduces k erroneous k-mers
Problem: each sequencing error introduces k erroneous k-mers
7 8 10 8 11 3 6 Over a threshold,
additional reads are redundant
5 5 5 5 5 3 5 Solution: digital normalization
reduces redundancy and errors
Assembly
Assembly SPADes
Assembly Alignment
Assembly Alignment BLAST
Assembly Taxonomy Alignment
Assembly Taxonomy Alignment NCBI
Problem: 67% of contigs in rodent dataset (serum) align to
human sequences
Problem: 67% of contigs in rodent dataset (serum) align to
human sequences Night-heron coronavirus HKU19 (1 Kb) Simian hemorrhagic fever virus (300 bp) Equine arteritis virus (3.7 Kb) Possum nidovirus Rodent hepacivirus Chipmunk parvovirus Theiler's disease-associated virus Reticuloendotheliosis virus Mosquito VEM Anellovirus SDBVL A Porcine reproductive and respiratory syndrome virus Dragonfly-associated circular virus 1 Gemycircularvirus 3 Rodent pegivirus Cyclovirus PK5510 Hypericum japonicum associated circular DNA virus
Pig stool associated circular ssDNA virus (1Kb) Avian gyrovirus 2
Torque teno sus virus 1a Mosquito VEM virus SDBVL G Turdivirus 3 Problem: 92% of contigs in bat dataset (droppings) don’t align to anything in NCBI
Lymphocytic choriomeningitis virus (7kb) Hepatitis C virus Amphotropic murine leukemia
virus Murid herpesvirus 1 Mosquito VEM Anellovirus SDBVL A Rat retrovirus SC1 Mason-Pfizer monkey virus (retrovirus) Eidolon helvum parvovirus 2 Periplaneta fuliginosa densovirus (also a parvovirus) Moloney murine sarcoma virus Sclerotinia sclerotiorum hypovirulence associated DNA virus 1 Problem: 95% of contigs in rodent dataset 2 (serum, spleen) align to mouse sequences (2)
7 out of 10 samples contained more than 1Kb of
Leishmania RNA virus (94% ident) 5 Kb genome
Lessons
Assume that 50% of your samples are going to fail
Lessons
Assume that 50% of your samples are going to fail
Lessons Design a small experiment, then iterate
Assume that 50% of your samples are going to fail
Lessons Design a small experiment, then iterate Come up with excuses to learn