METRICS                   1M          8M          8M
Number of Bases           15 Gb       96 Gb       99 Gb
Number of Reads           542,168     4,350,437   5,089,596
Pol. Read Length (Mean)   29,047 bp   22,079 bp   19,451 bp
Pol. Read Length (N50)    43,868 bp   36,813 bp   33,088 bp

- Up to 8 Cells per machine run
- Up to 0.8 Tb per machine run (80 hours)
Gb in 10 hours
B. subtilis, E. coli, O. sativa

METRICS                   1M          8M - human   8M - human   8M - E. coli   8M - B. subtilis   8M - rice
Number of Bases           15 Gb       96 Gb        99 Gb        86 Gb          116 Gb             85 Gb
Number of Reads           542,168     4,350,437    5,089,596    5,362,197      5,997,224          5,095,529
Pol. Read Length (Mean)   29,047 bp   22,079 bp    19,451 bp    15,952 bp      19,303 bp          16,758 bp
Pol. Read Length (N50)    43,868 bp   36,813 bp    33,088 bp    27,815 bp      32,932 bp          28,975 bp
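The tables report both a mean and an N50 polymerase read length, and the N50 is consistently well above the mean. A minimal sketch of the standard N50 definition follows (the read length L such that reads of length L or longer contain at least half of all bases). The function name `n50` and the toy read lengths are illustrative; the slides do not say how the instrument software computes the reported values.

```python
def n50(lengths):
    """Return the N50 of a collection of read lengths.

    Standard definition: sort reads from longest to shortest and walk down
    the list until the running total reaches half of all bases; the length
    of the read that crosses that point is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_of_bases = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= half_of_bases:
            return length
    raise ValueError("n50() needs at least one read length")


# Toy data (hypothetical, not from the slides): 6 reads, 30 bases in total.
toy_reads = [2, 3, 4, 5, 6, 10]
toy_mean = sum(toy_reads) / len(toy_reads)
toy_n50 = n50(toy_reads)
```

With the toy data the mean is 5 while the N50 is 6: a few long reads carry a large share of the bases, which is the same pattern the tables show.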
Number of Bases           44 Gb        320 Gb       318 Gb
Number of Reads           517,746      4,053,000    3,725,756
Pol. Read Length (Mean)   84,420 bp    78,877 bp    85,294 bp
Pol. Read Length (N50)    167,267 bp   166,571 bp   173,706 bp

10-12 kb human CCS template (20-30 hr acquisition):

METRICS
Number of >QV20 Bases     21 Gb
Number of >QV20 Reads     1,855,642
CCS Insert Length (Mean)  11,322 bp
CCS Read Score (Mean)     99.8%

- Up to 8 Cells per machine run
- Up to 2.4 Tb per machine run (200 hrs)
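The per-run ceilings quoted on these slides can be reproduced with simple arithmetic. The sketch below assumes round per-cell figures that are not stated explicitly in the deck: about 100 Gb and 10 hr per cell for the shorter movies, and about 300 Gb and 25 hr per cell (the midpoint of the 20-30 hr acquisition) for the longer ones. `run_totals` is a hypothetical helper name.

```python
def run_totals(cells, gb_per_cell, hours_per_cell):
    """Return (total yield in Tb, total hours) for one machine run."""
    total_tb = cells * gb_per_cell / 1000.0  # 1 Tb = 1000 Gb
    total_hours = cells * hours_per_cell     # cells are run one after another
    return total_tb, total_hours


# Assumed per-cell values (rounded from the tables, not quoted in the deck).
short_movie_run = run_totals(cells=8, gb_per_cell=100, hours_per_cell=10)
long_movie_run = run_totals(cells=8, gb_per_cell=300, hours_per_cell=25)
```

Under these assumptions the two calls give 0.8 Tb in 80 hours and 2.4 Tb in 200 hours, matching the "up to" figures on the slides.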
(re-engineered Unzip code)
   - Next release will use minimap instead of blasr for polishing
2. FALCON-Phase
   - https://www.biorxiv.org/content/early/2018/05/21/327064
   - Works well in our hands; VGP has numerous data sets that could be tried
of full-length haplotigs with Proximo (Phase Genomics)
- Scaffolds are chromosome-scale
- We know:
   - order of contigs along scaffold
   - pairing of phase 0 and phase 1
- Run FALCON-Phase on scaffolds

[Figure: scaffold of phase 0 contigs, phasing rerun; panel "Parental SNVs after scaffold phasing"; outputs labelled Scaffold 0 and Scaffold 1]

Output: Chromosome-scale, phased, diploid assembly!
(re-engineered Unzip code)
   - Next release will use minimap instead of blasr for polishing
2. FALCON-Phase
   - https://www.biorxiv.org/content/early/2018/05/21/327064
   - Works well in our hands; VGP has numerous data sets that could be tried
3. HiFi-based assembly
   - https://www.biorxiv.org/content/early/2019/01/13/519025
   - Exploratory, only tested on a very limited number of species (human, grape, tuna in progress), not yet a full workflow
- Have good CLR assembly (presented at webinar Nov 1, 2018): https://www.nature.com/webcasts/event/assembling-high-quality-genomes-to-solve-natures-mysteries/
- Example CCS run performance:

METRICS                   1M
Number of Bases           42 Gb
Number of Reads           351,297
Pol. Read Length (Mean)   119,705 bp
Pol. Read Length (N50)    212,769 bp

METRICS
Number of >QV20 Bases     2.6 Gb
Number of >QV20 Reads     203,524
CCS Insert Length (Mean)  12,624 bp
CCS Read Score (Median)   Q33
(re-engineered Unzip code)
   - Next release will use minimap instead of blasr for polishing
2. FALCON-Phase
   - https://www.biorxiv.org/content/early/2018/05/21/327064
   - Works well in our hands; VGP has numerous data sets that could be tried
3. HiFi-based assembly
   - https://www.biorxiv.org/content/early/2019/01/13/519025
   - Exploratory, only tested on a very limited number of species (human, grape, tuna in progress), not yet a full workflow
4. HiFi + Hi-C
   - Map Hi-C data directly to the reads
   - Another way of ‘binning’ the reads for samples where parents are not available (may be easier than raw reads)
de-multiplexing accuracy
- Improved artifact detection
- Same transcript recovery as Iso-Seq 1 and 2
- Works for whole and targeted transcriptome

IsoPhase:
- Allele-specific expression resolution
size of 2 Gb
- Throughput per day:
   - 60 Gb / SMRT Cell (long-insert, 2 x 10 hr)
      - 60-fold CLR coverage for 2 Gb genome (good for 1 species @60x traditional long-insert assembly)
   - 300 Gb / SMRT Cell (HiFi mode, 24 hr)
      - 10-fold HiFi coverage (good for 0.5 species @20x HiFi read assembly)
- CLR approach:
   - 1 species per day = ~30 species per instrument per month
   - 2 instruments = ~60 species per month, ~200 species in 4 months

Sequence ~200 genomes by September
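The throughput estimate above can be checked with a short back-of-the-envelope calculation. This sketch reads "2 x 10 hr" as two SMRT Cells per day and assumes a 30-day month; both are assumptions, as are all the variable names. The HiFi coverage per day is taken directly from the slide rather than derived.

```python
# Figures quoted on the slide.
GENOME_SIZE_GB = 2.0          # genome size used for planning
CLR_GB_PER_CELL = 60.0        # long-insert yield per SMRT Cell
CLR_TARGET_COVERAGE = 60.0    # fold coverage for a traditional CLR assembly
HIFI_COVERAGE_PER_DAY = 10.0  # fold HiFi coverage from one 24 hr cell
HIFI_TARGET_COVERAGE = 20.0   # fold coverage for a HiFi read assembly

# Assumptions made for this sketch (not stated explicitly on the slide).
CLR_CELLS_PER_DAY = 2         # reading "2 x 10 hr" as two cells per day
DAYS_PER_MONTH = 30
INSTRUMENTS = 2
MONTHS = 4


def fold_coverage(total_gb, genome_gb):
    """Sequenced bases divided by genome size."""
    return total_gb / genome_gb


def species_per_day(coverage_per_day, target_coverage):
    """How many genomes reach the target coverage in one day."""
    return coverage_per_day / target_coverage


clr_cov = fold_coverage(CLR_GB_PER_CELL * CLR_CELLS_PER_DAY, GENOME_SIZE_GB)
clr_species_day = species_per_day(clr_cov, CLR_TARGET_COVERAGE)
hifi_species_day = species_per_day(HIFI_COVERAGE_PER_DAY, HIFI_TARGET_COVERAGE)

species_per_month = clr_species_day * DAYS_PER_MONTH * INSTRUMENTS
species_total = species_per_month * MONTHS
```

Under these assumptions the CLR route gives 60-fold coverage and 1 species per day, the HiFi route 0.5 species per day, and two instruments give 60 species per month. Four months at that rate is 240 species, so the slide's "~200" leaves some margin; the slide does not say what that margin covers.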