• Is there cell-type specificity?
• Measuring the effect of splicing polymorphisms
• Inactivation of heterozygous genes did not fit the dosage compensation model
• Main text 4 pages, supplement 81 pages
• citation 93
Protein truncating variants (PTVs): genetic variants predicted to shorten the coding sequence of genes
GTEx + Genetic European Variation in Health and Disease
have major phenotypic consequences?
• in spite of PTV abundance in healthy individuals
• In most cases their precise molecular effect has not been characterized, and in other cases they show gain-of-function effects
[Table columns: Subjects; number of variants defined as PTV; avg. per individual]
Ref. Assessing allele-specific…
• Compare transcripts between the PTV and the non-PTV alleles within the same individual
• Validated by mmPCR-seq (microfluidics-based multiplex PCR and deep sequencing), Rui Zhang et al. 2014
• Dosage compensation
• Focus only on biallelic whole-gene deletions with strong experimental support and manual curation
• Identification of exon junctions
• Assess the influence of variants on splicing disruption
et al Bioinformatics 2013.)
• GTM* model to compute the statistics (see Assessing allele-specific…)
• Splice disruption: MCMC fitting to model (shown below)
• NMD: random forest individual (38 sequence and genomic features)
Splice disruption model:
  y_k = standardized splice junction quantification value of the PTV carrier k
  y_k ~ N(0, 1) for the general population
  γ_k = normal or disruption in carrier k
  π = proportion of the PTVs belonging to the splice disruption group
Pirinen et al. 2015: Normal, moderate, strong -> five states -> prior by Dirichlet
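As a rough illustration of how π, γ_k and y_k relate, the sketch below computes the posterior probability that carrier k belongs to the disruption group for fixed parameter values. It assumes a two-component mixture in which the non-disrupted component is N(0, 1), as on the slide, and the disrupted component is a normal distribution with hypothetical mean `mu_disrupt` and standard deviation `sd_disrupt`. Those two parameters, their defaults and the function names are assumptions, not taken from the source, which fits the model by MCMC rather than plugging in fixed values.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def disruption_posterior(y_k, pi, mu_disrupt=-3.0, sd_disrupt=1.0):
    """P(gamma_k = disruption | y_k) under a two-component mixture.

    y_k        : standardized splice junction quantification of carrier k
    pi         : proportion of PTVs in the splice disruption group
    mu_disrupt : ASSUMED mean of the disrupted component (hypothetical)
    sd_disrupt : ASSUMED sd of the disrupted component (hypothetical)
    """
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie in [0, 1]")
    # Weighted likelihood of each group for this carrier.
    w_disrupt = pi * normal_pdf(y_k, mu_disrupt, sd_disrupt)
    w_normal = (1.0 - pi) * normal_pdf(y_k, 0.0, 1.0)  # N(0, 1) background
    return w_disrupt / (w_disrupt + w_normal)


if __name__ == "__main__":
    for y in (0.0, -1.5, -4.0):
        print(y, round(disruption_posterior(y, pi=0.3), 4))
```

A strongly negative y_k moves the posterior toward the disruption group, while values near 0 stay with the general-population component. An MCMC fit would additionally sample π and the component parameters instead of fixing them.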
codon is in the last exon, or 2) the resulting premature termination codon is in the last 50 nucleotides of the second-to-last exon
http://sift.bii.a-star.edu.sg/www/indels_help.html?mybuild=hs38
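The two conditions above correspond to the commonly used 50-bp rule for predicting NMD escape. A minimal sketch, assuming 1-based exon numbering in transcript order and a hypothetical `dist_to_exon_end_nt` argument that counts nucleotides from the premature termination codon to the 3' end of its exon. How that distance is counted at the boundary is an assumption; the slide does not specify it.

```python
def predicts_nmd_escape(ptc_exon_index, n_exons, dist_to_exon_end_nt):
    """Return True if the 50-bp rule predicts escape from NMD.

    ptc_exon_index      : 1-based index of the exon holding the premature
                          termination codon (PTC)
    n_exons             : total number of exons in the transcript
    dist_to_exon_end_nt : nucleotides from the PTC to the 3' end of its exon
                          (hypothetical convention, see lead-in)
    """
    if n_exons < 1 or not 1 <= ptc_exon_index <= n_exons:
        raise ValueError("exon index out of range")
    if dist_to_exon_end_nt < 0:
        raise ValueError("distance must be non-negative")
    # Condition 1: PTC in the last exon.
    if ptc_exon_index == n_exons:
        return True
    # Condition 2: PTC in the last 50 nt of the second-to-last exon.
    if ptc_exon_index == n_exons - 1 and dist_to_exon_end_nt <= 50:
        return True
    # Otherwise the PTC is predicted to trigger NMD.
    return False


if __name__ == "__main__":
    print(predicts_nmd_escape(5, 5, 200))  # last exon
    print(predicts_nmd_escape(4, 5, 30))   # near end of penultimate exon
    print(predicts_nmd_escape(2, 5, 10))   # upstream exon
```

A rule like this would be only one input among the 38 features; on its own it ignores everything else the random forest uses.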
• ASE-based annotation (no NMD = escape, others = trigger)
• Rare variants predicted to induce NMD (n = 287): 17.9% heterogeneous vs 8.1% tissue-specific
  Heterogeneous: heterogeneous effects for each tissue type
  Tissue-specific: specific to a single tissue
[Figure values: 30.5%, 48.3%, 3.3%]
• 38 features including the 50-bp rule
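The ASE-based labels above (no NMD = escape, otherwise = trigger) can be illustrated with a toy read-count check. This sketch swaps in a plain two-sided exact binomial test against a 50:50 allelic ratio, which is not the study's statistical model. The function names and the `alpha` cutoff are hypothetical, and the code is meant for modest read depths only.

```python
from math import comb


def two_sided_binomial_p(ptv_reads, ref_reads, p_null=0.5):
    """Exact two-sided binomial p-value for allelic balance.

    Sums the probabilities of all outcomes that are no more likely than
    the observed PTV-allele read count under the null ratio p_null.
    """
    n = ptv_reads + ref_reads
    if n == 0:
        raise ValueError("no reads cover the variant")
    probs = [comb(n, i) * p_null**i * (1.0 - p_null) ** (n - i)
             for i in range(n + 1)]
    observed = probs[ptv_reads]
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(p for p in probs if p <= observed * (1.0 + 1e-9))
    return min(1.0, p_value)


def annotate_nmd(ptv_reads, ref_reads, alpha=0.05):
    """Label a variant 'escape' (no detectable ASE) or 'trigger' (ASE).

    alpha is an ASSUMED significance cutoff, not a value from the source.
    """
    p_value = two_sided_binomial_p(ptv_reads, ref_reads)
    return "trigger" if p_value < alpha else "escape"


if __name__ == "__main__":
    print(annotate_nmd(10, 10))  # balanced alleles
    print(annotate_nmd(0, 20))   # PTV allele not observed
```

Because the test is two-sided, any imbalance counts as ASE, which matches the "others = trigger" shorthand; a stricter version would require the PTV allele specifically to be under-represented.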
ASE tissue specificity (examples)
• the nonsense variant rs149244943 in gene PHKB (phosphorylase kinase, beta)
• F: classified as having moderate ASE across all tissues
• the nonsense variant rs119455955, a disease mutation for recessive late-infantile neuronal ceroid lipofuscinosis in gene TPP1 (tripeptidyl peptidase I)
Other evidence
splicing: variation around splice junctions tends to be rare (minor allele frequency <= 0.01)
[Figure labels: Reported in ClinVar; Swedish exome sequencing; Median genomic evolutionary rate; Branch site; SDM ratio; Proportion of variants inducing SDM]
Discussion
human genes are haplosufficient
• -> homeostatic mechanisms (possibly as proposed in the theory of dominance) in heterozygotes and inactivation in homozygotes
• Larger data sets will be required to increase our power to predict molecular consequences of variants from sequence data alone
• Personal transcriptomics will become an important complement to genome analysis.
• MAMBA contains many methods that are shared (e.g. NMD predictor, splicing disruption model)