following GitHub Gist to find each command for this session. Commands should be copied and pasted from this Gist:
https://gist.github.com/arq5x/9e1928638397ba45da2e#file-autosomal-recessive-sh
Aaron Quinlan, University of Utah, quinlanlab.org
confidence as deleterious alleles on different chromosomes
* Convention for phased genotype is maternal allele first
G|A T|C  Both sites phasable: yet exclude as deleterious alleles on same chromosomes
G|A ?    Only one site is phasable: lower confidence but cannot necessarily exclude.
? ?      Neither site is phasable: lower confidence but cannot necessarily exclude (recombination?).
[Figure: trio pedigrees with unphased genotypes at two sites]
A/G A/A A/G   C/T C/T C/T
G/G A/A A/G   C/T C/T C/T
G/A A/A A/G   C/C C/T
A/G A/G A/G   C/T C/T C/T
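The phasing logic on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GEMINI's implementation: the function names `phase_by_transmission` and `comp_het_call` are hypothetical, genotypes are plain strings such as `A/G`, and a site counts as phasable only when exactly one maternal/paternal allele assignment explains the child's genotype.

```python
from itertools import product


def phase_by_transmission(kid, mom, dad):
    """Return 'maternal|paternal' if the kid's genotype is phasable, else None.

    Hypothetical helper: genotypes are unphased strings like 'A/G'.
    """
    kid_alleles = sorted(kid.split("/"))
    # Every (maternal allele, paternal allele) pair that reproduces the kid's genotype.
    candidates = {
        (m, d)
        for m, d in product(mom.split("/"), dad.split("/"))
        if sorted((m, d)) == kid_alleles
    }
    if len(candidates) == 1:
        m, d = next(iter(candidates))
        return f"{m}|{d}"  # maternal allele first, as on the slide
    return None  # ambiguous (e.g. all three heterozygous) or inconsistent


def comp_het_call(phase1, alt1, phase2, alt2):
    """Classify a candidate pair of sites given phased genotypes and alt alleles."""
    if phase1 is None or phase2 is None:
        return "unphased: lower confidence, cannot exclude"
    side1 = phase1.split("|").index(alt1)  # 0 = maternal copy, 1 = paternal copy
    side2 = phase2.split("|").index(alt2)
    if side1 != side2:
        return "different chromosomes: keep"
    return "same chromosome: exclude"


site1 = phase_by_transmission("A/G", mom="A/G", dad="A/A")
site2 = phase_by_transmission("C/T", mom="C/T", dad="C/T")
```

Here `site1` is phasable because the father can only contribute `A`, while `site2` is not, since all three individuals are heterozygous.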
VCF has been normalized and decomposed with VT
http://gemini.readthedocs.org/en/latest/content/preprocessing.html#step-1-split-left-align-and-trim-variants
2. The VCF has been annotated with VEP.

$ curl https://s3.amazonaws.com/gemini-tutorials/trio.trim.vep.vcf.gz > trio.trim.vep.vcf.gz
$ curl https://s3.amazonaws.com/gemini-tutorials/recessive.ped > recessive.ped
$ gemini load --cores 4 \
    -v trio.trim.vep.vcf.gz \
    -t VEP \
    --skip-gene-tables \
    -p recessive.ped \
    trio.trim.vep.recessive.db

Note: copy and paste the full command from the GitHub Gist to avoid errors.
[Figure labels: 4805 1847 1805]
cadd_raw" \
    --filter "impact_severity != 'LOW' \
        and ((aaf_esp_ea <= 0.01 or aaf_esp_ea is NULL) \
        and (aaf_exac_all <= 0.01 or aaf_exac_all is NULL))" \
    trio.trim.vep.recessive.db \
    | awk '$14==1' \
    | wc -l

Use ESP and ExAC to focus on rare variants.
Result: 8 lines, 4 comp_hets.
Note: copy and paste the full command from the GitHub Gist.
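The `or ... is NULL` clauses matter because a variant absent from ESP or ExAC has a NULL allele frequency, and in SQL a comparison against NULL is never true. The sketch below shows this on a toy in-memory SQLite table. The table contents are invented for illustration and only borrow the column names from the filter above; it does not touch the tutorial database.

```python
import sqlite3

# Toy stand-in for a variants table; all rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE variants ("
    "variant_id INTEGER, impact_severity TEXT, aaf_esp_ea REAL, aaf_exac_all REAL)"
)
conn.executemany(
    "INSERT INTO variants VALUES (?, ?, ?, ?)",
    [
        (1, "HIGH", 0.001, 0.002),  # rare in both databases
        (2, "MED", None, None),     # absent from both databases (NULL)
        (3, "HIGH", 0.20, 0.15),    # common
        (4, "LOW", 0.001, None),    # rare, but low severity
        (5, "MED", None, 0.05),     # absent from ESP, too common in ExAC
    ],
)

WITH_NULL_CHECK = (
    "impact_severity != 'LOW' "
    "and ((aaf_esp_ea <= 0.01 or aaf_esp_ea is NULL) "
    "and (aaf_exac_all <= 0.01 or aaf_exac_all is NULL))"
)
WITHOUT_NULL_CHECK = (
    "impact_severity != 'LOW' and aaf_esp_ea <= 0.01 and aaf_exac_all <= 0.01"
)


def select_ids(where_clause):
    """Return the variant_ids that satisfy a WHERE clause, in ascending order."""
    query = f"SELECT variant_id FROM variants WHERE {where_clause} ORDER BY variant_id"
    return [row[0] for row in conn.execute(query)]


kept_with_null_check = select_ids(WITH_NULL_CHECK)
kept_without_null_check = select_ids(WITHOUT_NULL_CHECK)
```

Without the NULL checks, variant 2 (never observed in either database, and therefore a strong rare-variant candidate) is silently dropped.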
of families required to have a variant in the same gene in order for it to be reported. For example, we may only be interested in candidates where at least 2 families have a variant in that gene.
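The idea of requiring a minimum number of families per gene can be sketched as a simple count of distinct families. This is an illustration only: the function name `genes_meeting_min_kindreds`, the family IDs and the gene names are all hypothetical, and it does not reproduce how GEMINI applies this threshold internally.

```python
from collections import defaultdict


def genes_meeting_min_kindreds(candidates, min_kindreds=2):
    """Return genes with candidate variants in at least `min_kindreds` families.

    `candidates` is an iterable of (family_id, gene) pairs, one per candidate variant.
    """
    families_by_gene = defaultdict(set)
    for family_id, gene in candidates:
        families_by_gene[gene].add(family_id)  # a set counts each family once
    return sorted(
        gene
        for gene, families in families_by_gene.items()
        if len(families) >= min_kindreds
    )


# Made-up candidates: GENE_A is hit in two families, GENE_B twice in one family.
example_candidates = [
    ("fam1", "GENE_A"),
    ("fam2", "GENE_A"),
    ("fam1", "GENE_B"),
    ("fam1", "GENE_B"),
    ("fam3", "GENE_C"),
]
reported_genes = genes_meeting_min_kindreds(example_candidates, min_kindreds=2)
```

Note that two variants in the same family do not count twice, so GENE_B is not reported here.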
to eliminate less confident genotypes, it is possible to enforce a maximum PL value for each sample. On this scale, lower values indicate more confidence that the called genotype is correct. 10 is a reasonable value:
What is the “PL”? https://samtools.github.io/hts-specs/VCFv4.2.pdf
What is a “Phred scaled” genotype likelihood?
genotype likelihood?
Example calculation based on the GATK HaplotypeCaller:
http://gatkforums.broadinstitute.org/discussion/5913/math-notes-how-pl-is-calculated-in-haplotypecaller
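As a rough sketch of the arithmetic: a Phred-scaled likelihood is -10 * log10(likelihood), and the PL values are then shifted so the most likely genotype gets PL = 0 and rounded to integers. The function name `phred_scaled_pls` and the input likelihoods below are made up for illustration. Real callers work from per-read likelihoods, which this example skips.

```python
import math


def phred_scaled_pls(likelihoods):
    """Convert genotype likelihoods to normalized, integer Phred-scaled PLs.

    `likelihoods` holds one likelihood per genotype, e.g. [AA, AB, BB].
    """
    raw = [-10 * math.log10(lk) for lk in likelihoods]
    best = min(raw)  # the most likely genotype has the smallest raw value
    return [int(round(value - best)) for value in raw]


# Illustrative likelihoods for genotypes AA, AB, BB (not from real data).
example_pls = phred_scaled_pls([0.01, 1.0, 0.1])
```

In this example the heterozygous genotype is the call (PL = 0). The next most likely genotype is BB with PL = 10, meaning it is ten times less likely than the call.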
region(s) (Example commands)
1. Tabix a BED file with the observed homozygosity regions.
2. Use the annotate tool to flag variants that overlap these regions.
   gemini annotate -f homoz_region.bed.gz \
       -c homoz_region \
       -a boolean \
       AR.db
3. Filter variants for those that overlap these regions.
   gemini autosomal_recessive AR.db \
       --columns "chrom, start, end, ref, alt, filter, qual, gene, impact, aaf_esp_ea, aaf_1kg_eur" \
       --filter "filter is NULL and aaf_esp_ea < 0.1 and (impact_severity = 'HIGH' or impact_severity = 'MED') and homoz_region == 1"
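Conceptually, a boolean annotation marks each variant with 1 if it overlaps any region in the BED file and 0 otherwise. The sketch below illustrates that overlap test. It assumes 0-based, half-open intervals, and the function name `flag_overlaps` and the coordinates are hypothetical. It uses a simple linear scan rather than the tabix index a real tool would use.

```python
def flag_overlaps(variants, regions):
    """Return a 0/1 flag per variant: 1 if it overlaps any region, else 0.

    Both inputs are lists of (chrom, start, end) with 0-based, half-open coordinates.
    """
    flags = []
    for v_chrom, v_start, v_end in variants:
        hit = any(
            v_chrom == r_chrom and v_start < r_end and r_start < v_end
            for r_chrom, r_start, r_end in regions
        )
        flags.append(1 if hit else 0)
    return flags


# Made-up homozygosity regions and variants for illustration.
example_regions = [("chr1", 1000, 5000), ("chr2", 200, 300)]
example_variants = [
    ("chr1", 999, 1000),   # ends exactly where the region starts: no overlap
    ("chr1", 4999, 5000),  # last base of the region: overlap
    ("chr2", 250, 251),    # inside the chr2 region
    ("chr3", 250, 251),    # different chromosome
]
homoz_region_flags = flag_overlaps(example_variants, example_regions)
```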
exome data. Density of markers.
2. Shorter runs of homozygosity happen often by chance.
3. Density of homozygotes is important.
http://www.nature.com/nature/journal/v449/n7164/extref/nature06258-s1.pdf
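To illustrate why short runs deserve less weight, the sketch below finds runs of consecutive homozygous calls and keeps only those with a minimum number of markers. It is a toy example under stated assumptions: the function name `homozygous_runs`, the threshold and the genotype calls are hypothetical, and it ignores marker spacing and genotyping error, which matter for real exome data.

```python
def homozygous_runs(calls, min_markers=3):
    """Return (first_pos, last_pos, n_markers) for each qualifying homozygous run.

    `calls` is a list of (position, genotype) sorted by position on one chromosome,
    with genotypes written like 'A/A' or 'A/G'.
    """
    runs = []
    current = []  # positions in the run being extended
    for position, genotype in calls:
        allele1, allele2 = genotype.split("/")
        if allele1 == allele2:
            current.append(position)
            continue
        # A heterozygous call ends the current run.
        if len(current) >= min_markers:
            runs.append((current[0], current[-1], len(current)))
        current = []
    if len(current) >= min_markers:  # close a run that reaches the last marker
        runs.append((current[0], current[-1], len(current)))
    return runs


# Made-up calls: a 2-marker run, one heterozygote, then a 4-marker run.
example_calls = [
    (100, "A/A"), (200, "G/G"), (300, "A/G"),
    (400, "C/C"), (500, "T/T"), (600, "C/C"), (700, "G/G"),
]
long_runs = homozygous_runs(example_calls, min_markers=3)
```

Lowering `min_markers` to 2 would also report the short run at positions 100 to 200, which is the kind of run that often arises by chance.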