Hunting for viruses in French Guiana
Nacho Caballero
April 29, 2014
Science
Lab meeting presentation about my work doing viral metagenomic analysis in French Guiana
Transcript
Virus Hunting in French Guiana
Nacho Caballero
French Guiana
Rodents, bats, Leishmania
Capture, isolate viral particles, extract RNA, sequence
[Figure: estimated read coverage for the rodent and bat datasets; axis: % of reads with coverage smaller than x]
How can we estimate the coverage without a reference genome?
[Figure: a read split into k-mers, each k-mer with a count of 1]
k-mer counts: 7 8 10 8 11 3 6
Median k-mer count ≈ read coverage
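The idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the pipeline used for the talk: the reads, the value k=4, and the helper names (`kmers`, `count_kmers`, `estimated_coverage`) are all made up for the example.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return every overlapping substring of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def count_kmers(reads, k):
    """Count how often each k-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def estimated_coverage(read, counts, k):
    """Estimate a read's coverage as the median count of its k-mers."""
    return median(counts[kmer] for kmer in kmers(read, k))


# Toy data (hypothetical): one read sequenced three times, plus one
# unrelated read that shares no 4-mers with it.
reads = ["ACGTTGCAAG"] * 3 + ["TTTTCCCCGG"]
counts = count_kmers(reads, k=4)

coverage_common = estimated_coverage("ACGTTGCAAG", counts, k=4)
coverage_rare = estimated_coverage("TTTTCCCCGG", counts, k=4)

# The k-mer counts shown on the slide.
slide_median = median([7, 8, 10, 8, 11, 3, 6])
```

The median is used rather than the mean so that a few unusually low or high k-mer counts in a read do not distort the estimate.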
k-mers make it possible to align without a reference
Problem: each sequencing error introduces up to k erroneous k-mers
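A small sketch of why one wrong base is so costly. It assumes a made-up 16-base read and k=4; the sequences and the helper names (`kmer_set`, `novel_kmers`) are hypothetical.

```python
def kmer_set(seq, k):
    """Return the set of distinct k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(true_seq, observed_seq, k):
    """K-mers present in the observed read but absent from the true one."""
    return kmer_set(observed_seq, k) - kmer_set(true_seq, k)


k = 4
true_read = "ACGTTGCAAGCTAGGT"  # hypothetical error-free read

# A single substitution in the middle of the read (position 8, A -> T)
# changes every k-mer that overlaps that position.
middle_error = true_read[:8] + "T" + true_read[9:]
novel_middle = novel_kmers(true_read, middle_error, k)

# The same kind of error at the very first base only touches one k-mer.
edge_error = "T" + true_read[1:]
novel_edge = novel_kmers(true_read, edge_error, k)
```

In this toy case the middle error yields k new k-mers and the edge error yields one, so k is an upper bound per error.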
k-mer counts: 7 8 10 8 11 3 6
Over a threshold, additional reads are redundant
Solution: digital normalization reduces redundancy and errors
k-mer counts after normalization: 5 5 5 5 5 3 5
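One common way to phrase digital normalization is as a single streaming pass: keep a read only if the median count of its k-mers, among the reads kept so far, is still below a cutoff. The sketch below follows that formulation. The cutoff of 5 is an assumption chosen to match the capped counts on the slide, and the reads and function names are hypothetical; the deck does not say which tool or parameters were used.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return every overlapping substring of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def digital_normalization(reads, k, cutoff):
    """Keep a read only while its estimated coverage is below cutoff."""
    counts = Counter()  # k-mer counts over the reads kept so far
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k, nothing to count
        if median(counts[kmer] for kmer in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept


# Toy data (hypothetical): one sequence covered 20 times, another twice.
reads = ["ACGTTGCAAG"] * 20 + ["TTTTCCCCGG"] * 2
kept = digital_normalization(reads, k=4, cutoff=5)

kept_common = kept.count("ACGTTGCAAG")
kept_rare = kept.count("TTTTCCCCGG")
```

The highly covered sequence is trimmed down to the cutoff while the low-coverage one is left untouched. Every discarded redundant read also takes any sequencing errors it carries out of the dataset.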
Assembly: SPAdes
Alignment: BLAST
Taxonomy: NCBI
Problem: 67% of contigs in rodent dataset (serum) align to human sequences
    Night-heron coronavirus HKU19 (1 Kb)
    Simian hemorrhagic fever virus (300 bp)
    Equine arteritis virus (3.7 Kb)
    Possum nidovirus
    Rodent hepacivirus
    Chipmunk parvovirus
    Theiler's disease-associated virus
    Reticuloendotheliosis virus
    Mosquito VEM Anellovirus SDBVL A
    Porcine reproductive and respiratory syndrome virus
    Dragonfly-associated circular virus 1
    Gemycircularvirus 3
    Rodent pegivirus
    Cyclovirus PK5510
    Hypericum japonicum associated circular DNA virus
Problem: 92% of contigs in bat dataset (droppings) don't align to anything in NCBI
    Pig stool associated circular ssDNA virus (1 Kb)
    Avian gyrovirus 2
    Torque teno sus virus 1a
    Mosquito VEM virus SDBVL G
    Turdivirus 3
Problem: 95% of contigs in rodent dataset 2 (serum, spleen) align to mouse sequences (2)
    Lymphocytic choriomeningitis virus (7 Kb)
    Hepatitis C virus
    Amphotropic murine leukemia virus
    Murid herpesvirus 1
    Mosquito VEM Anellovirus SDBVL A
    Rat retrovirus SC1
    Mason-Pfizer monkey virus (retrovirus)
    Eidolon helvum parvovirus 2
    Periplaneta fuliginosa densovirus (also a parvovirus)
    Moloney murine sarcoma virus
    Sclerotinia sclerotiorum hypovirulence associated DNA virus 1
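Percentages like the ones above can be computed from BLAST results in a few lines. The sketch below assumes BLAST's 12-column tabular output (`-outfmt 6`: query, subject, % identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score). The deck does not say which output format was used, and the contig names, subject IDs, and the `host_subjects` set are invented for the example.

```python
def best_hits(blast_tabular):
    """Map each query contig to the subject of its highest-bitscore hit."""
    best = {}
    for line in blast_tabular.strip().splitlines():
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Hypothetical data: four assembled contigs, three of which have hits.
contigs = ["contig1", "contig2", "contig3", "contig4"]
rows = [
    ["contig1", "host_seq_A", "99.0", "500", "5", "0",
     "1", "500", "1", "500", "1e-100", "900"],
    ["contig1", "virus_seq_B", "80.0", "200", "40", "0",
     "1", "200", "1", "200", "1e-20", "150"],
    ["contig2", "virus_seq_B", "94.0", "300", "18", "0",
     "1", "300", "1", "300", "1e-50", "300"],
    ["contig3", "host_seq_A", "98.0", "400", "8", "0",
     "1", "400", "1", "400", "1e-90", "700"],
]
blast_output = "\n".join("\t".join(row) for row in rows)

# Hypothetical set of subject IDs known to be host sequences.
host_subjects = {"host_seq_A"}

hits = best_hits(blast_output)
host_count = sum(1 for subject in hits.values() if subject in host_subjects)
host_pct = percent(host_count, len(contigs))
unaligned_pct = percent(len(contigs) - len(hits), len(contigs))
```

A real analysis would map subject IDs to taxa through the NCBI taxonomy rather than a hand-written set.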
7 out of 10 samples contained more than 1 Kb of Leishmania RNA virus (94% identity, 5 Kb genome)
Lessons
Assume that 50% of your samples are going to fail
Design a small experiment, then iterate
Come up with excuses to learn