Using natural genetic diversity to discover protein regulatory networks

Slides from my seminar at the Vanderbilt Genetics Institute on 6-21-2017

Steve Munger

June 21, 2017

Transcript

  1. Harnessing genetic diversity to discover protein regulatory networks Steve Munger

    The Jackson Laboratory
  2. How does genetic variation affect transcript and protein abundance?

    [Diagram: DNA -> RNA (Transcription) -> Protein (Translation). Francis Crick 1956]
  3. Nature July 2013 “We estimate that approximately one-half of pQTLs

    are probably also eQTLs. However, many pQTLs do not correspond to eQTLs, even at a relaxed stringency.” Accumulating evidence of a disconnect between transcript and protein expression. September 2014 February 2015 “QTLs affecting mRNA levels are, on average, attenuated or buffered at the protein level…
  4. “Next Generation” genetic models: The founder strains of the mouse

    Diversity Outbred stock CAST 129S1 WSB NZO A/J B6 PWK NOD
  5. Diversity Outbred (DO) Heterogeneous Stock

  6. Diversity Outbred (DO) mice: A reservoir of natural genetic perturbations.

    - 45M+ SNPs
    - 2M+ indels
    - Balanced population structure
    - Each individual unique
    - 400+ recombinations in each animal
    - High heterozygosity
  7. DO mice are genetically and phenotypically diverse

    [Plots: body weight (gm) by date, 7/11/2014 to 8/20/2014, in two panels: female DO mice and male DO mice. Alan Attie & Mark Keller]
  8. Diversity Outbred mice exhibit phenotypes far exceeding the range observed

    in the founder strains.
  9. 192 DO Livers Transcripts Short Reads RNA-Seq eQTL pQTL eQTL

    Mapping pQTL Mapping Proteins Peptides MS/MS Compare ? Munger et al. 2014 Chick*, Munger* et al. Nature, 2016 How does genetic variation influence transcript and protein abundance?
  10. Challenge: Every mouse is a unique diploid combination of 10M+

    SNPs and 500K+ indels.
  11. Munger et al. 2014 Gatti et al. 2014 Constructing individualized

    diploid transcriptomes for RNA-seq alignment with Seqnature.
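
The idea of an individualized diploid transcriptome can be pictured with the toy sketch below. It is not Seqnature's code or interface: the helper `apply_snps`, the 0-based positions, the reference sequence, and the variants are all made up for illustration, and indels (which shift coordinates) are left out.

```python
# Toy sketch: build two haplotype-specific copies of a reference transcript.
# Hypothetical helper, not part of Seqnature; SNPs only, 0-based positions.


def apply_snps(reference, snps):
    """Return a copy of `reference` with each (position, ref, alt) SNP applied."""
    bases = list(reference)
    for position, ref_base, alt_base in snps:
        if bases[position] != ref_base:
            raise ValueError(f"reference mismatch at position {position}")
        bases[position] = alt_base
    return "".join(bases)


reference = "ATGGCCAAGTTC"                        # made-up reference transcript
maternal_snps = [(3, "G", "A")]                   # made-up variants, haplotype 1
paternal_snps = [(3, "G", "A"), (9, "T", "C")]    # made-up variants, haplotype 2

# One sequence per haplotype, so reads can align to the allele they came from.
diploid_transcriptome = {
    "Yfg_maternal": apply_snps(reference, maternal_snps),
    "Yfg_paternal": apply_snps(reference, paternal_snps),
}
```

Reads from this animal would then be aligned against both entries rather than against the single reference sequence.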
  12. Seqnature Munger et al. 2014 Gatti et al. 2014 Al

    Simons Narayanan Raghupathy Kwangbom Choi Dan Gatti
  13. Every DO sample will have a unique gene set that

    is sensitive to alignment errors from reference alignment…
  14. Analysis Pipeline ~ 30 million SE 100bp reads

    [Diagram: reads for Yfg in Mouse 1, Mouse 2, Mouse 3, x 272 mice. RSEM (Li and Dewey 2010)]
    1. Align reads to transcriptome.
    2. Estimate gene and isoform expression.
    3. Map expression QTL
  15. Alignment to individualized transcriptomes results in fewer spurious liver eQTL.

    Rps12-ps2 Aligned to NCBIm37 Aligned to DO IRGs
  16. Hebp1 Aligned to NCBIM37 Aligned to DO IRGs Alignment to

    individualized transcriptomes reveals significant local eQTLs for 2,000+ genes.
  17. Munger et al. 2014 Are these unmasked local eQTLs real?

    Yes. CC/DO Founder Strain samples
  18. The founder origin of each allele provides direct estimates of

    allele specific expression. Only alleles derived from 129S1 express Gm12976 in the DO population.
  19. Allele specific expression is the rule rather than the exception

    in genetically diverse individuals.
  20. The DO is a reservoir of genetic perturbations. ~75% of

    genes have eQTL. [Plot: gene location vs. eQTL location, chromosomes 1-19 and X on both axes]
  21. 192 DO Livers Transcripts Short Reads RNA-Seq eQTL pQTL eQTL

    Mapping pQTL Mapping Proteins Peptides MS/MS Compare ? Munger et al. 2014 Chick*, Munger* et al. Nature 2016 How does genetic variation affect protein abundance?
  22. An unprecedented view of protein regulation. 2,866 pQTL detected for

    2,552 proteins. Total eQTL pQTL 2306 1152 1400 N=6707 p < 2.2e-16 FDR < 0.1
  23. 80% of proteins with local pQTL have concordant local eQTL

    Local eQTL pQTL 1819 344 1392 QTL RNA Protein cis cis
  24. 20% of proteins with local pQTL lack concordant local eQTL

    Local eQTL pQTL 1819 344 1392 QTL RNA Protein cis
  25. 25% of expressed proteins appear buffered from local transcriptional variation

    Local eQTL pQTL 1819 344 1392 QTL RNA Protein cis
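
The 80% and 20% figures on the local pQTL slides can be reproduced from the three counts printed with the Venn diagram. Which region each count belongs to is not stated in the transcript; the assignment below is an assumption chosen because it matches the slide titles.

```python
# Counts as printed on the slides. The mapping of each count to a Venn
# region is an assumption inferred from the slide titles.
local_eqtl_only = 1819   # local eQTL without a local pQTL (assumed)
local_pqtl_only = 344    # local pQTL without a local eQTL (assumed)
local_both = 1392        # concordant local eQTL and local pQTL (assumed)

proteins_with_local_pqtl = local_pqtl_only + local_both
concordant_fraction = local_both / proteins_with_local_pqtl
discordant_fraction = local_pqtl_only / proteins_with_local_pqtl

print(f"{concordant_fraction:.1%} concordant, {discordant_fraction:.1%} not")
```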
  26. 1,130 distant pQTL indicate extensive trans regulation of protein abundance.

  27. Only 9 out of 1130 distant pQTL have concordant distant

    eQTL. Distant eQTL pQTL 915 1039 9 cis RNA Protein QTL trans FDR < 0.1
  28. What post-transcriptional mechanism is acting in trans to control these

    proteins? Distant eQTL pQTL 915 1039 9 RNA Protein QTL trans
  29. Searching for protein and transcript mediators of distant pQTL -> Mediation Analysis

    [Diagram: a distant QTL acting in trans on the target protein, with cis-regulated RNA and protein as candidate causal intermediates]
    Target Protein ~ pQTLdistant
    Target Protein ~ pQTLdistant + MediatorProtein (x 8000 proteins)
    Target Protein ~ pQTLdistant + MediatorRNA (x 21000 transcripts)
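
The three models above can be sketched on simulated data. This is a minimal illustration and not the analysis code behind the slides: it assumes a single additive genotype (0/1/2) in place of the eight-founder haplotype probabilities used with DO mice, it conditions on one mediator at a time, and every name and number in it is made up. A candidate is flagged as a mediator when adding it to the model makes the distant pQTL LOD score drop.

```python
import math
import random


def center(values):
    """Subtract the mean from each value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def residuals(y, x):
    """Residuals of y after simple linear regression on x (with intercept)."""
    cy, cx = center(y), center(x)
    slope = sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)
    return [yi - slope * xi for xi, yi in zip(cx, cy)]


def lod(target, qtl, covariate=None):
    """LOD for target ~ qtl, optionally conditioning on one covariate.

    Uses LOD = (n/2) * log10(RSS_null / RSS_full) = -(n/2) * log10(1 - r^2),
    where r^2 is the (partial) squared correlation of target and qtl.
    """
    if covariate is None:
        ry, rq = center(target), center(qtl)
    else:
        ry, rq = residuals(target, covariate), residuals(qtl, covariate)
    r2 = sum(a * b for a, b in zip(ry, rq)) ** 2 / (
        sum(a * a for a in ry) * sum(b * b for b in rq)
    )
    return -(len(target) / 2) * math.log10(1 - r2)


# Simulated data: the QTL acts on the target only through the mediator.
rng = random.Random(0)
n = 192
qtl = [float((rng.random() < 0.5) + (rng.random() < 0.5)) for _ in range(n)]
mediator = [g + rng.gauss(0, 0.5) for g in qtl]
target = [m + rng.gauss(0, 0.5) for m in mediator]
unrelated = [rng.gauss(0, 1) for _ in range(n)]

lod_base = lod(target, qtl)                        # Target ~ pQTL
lod_given_mediator = lod(target, qtl, mediator)    # Target ~ pQTL + true mediator
lod_given_unrelated = lod(target, qtl, unrelated)  # Target ~ pQTL + unrelated gene
```

In a scan over thousands of candidate proteins or transcripts, most candidates behave like `unrelated` and leave the LOD roughly unchanged, while a causal intermediate stands out as a sharp drop.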
  30. Mediation analysis reveals causal intermediates. pQTLD Tmem68 TMEM68 trans 13

    Target 3 cis
  31. Tmem68 TMEM68 trans 13 cis Target 3 cis cis Nnt

    NNT Mediation analysis reveals causal intermediates.
  32. 43,102 SNPs in region 3 Candidate SNPs 1 Short Deletion

    Nnt eQTL B6 alleles do not express Nnt Low abundance of NNT in C57BL/6J drives low abundance of TMEM68.
  33. We re-discovered a known deficiency in C57BL/6J and assigned Tmem68

    to a pathway. Free Radical Biology and Medicine 2013
  34. TMEM68-NNT Next Step: Validation Transcript Abundance Protein Abundance Tmem68 TMEM68

    trans 13 cis 3 cis cis Nnt NNT Alex Stanton – Tufts predoc Tmem68 TMEM68 3 cis Question 1: Where does the disconnect between Tmem68 transcript and protein occur? At the level of translation, or by post-translational mechanisms?
  35. Genetic variation may affect translation or post-translational mechanisms. Tmem68

    TMEM68 TMEM68 Translation Total RNA “Steady state” Protein Translated Protein Post-Translation Folding Stability Post-translational mods Phosphorylation Acetylation Ubiquitination Methylation Localization/Exportation Translation Efficiency Translation pausing Alternative ORF
  36. Brar and Weissman 2015 Not all transcripts are translated equally

  37. Protein complex members are tightly coregulated, with one member adopting

    the “regulatory” role. Chaperonin containing TCP1 complex
  38. CCT2 Mediation Analysis

  39. Cct6a Low expression of Cct6a in NOD/ShiLtJ Drives low expression

    of CCT complex
  40. Stoichiometric buffering of protein abundance

    [Diagram: CCT complex subunits CCT2, CCT3, CCT4, CCT5, CCT6A, CCT7, CCT8 and TCP1 drawn in differing copy numbers, with the assembled complex labeled "Stable"]
  41. Stoichiometric buffering of CCT complex: Next steps

    •  Observation: DO animals with NOD allele at Cct6a have lower protein abundance of all CCT members.
    •  NOD mutation affects both Cct6a transcript and protein levels.
    •  NOD has a transversion substitution in conserved KLF4 binding domain in Cct6a promoter.
    •  Luciferase assays to quantify effect of this substitution on promoter strength.
    •  CRISPR to “fix” mutation in NOD and introduce same mutation into B6.
    •  Test stoichiometric buffering hypothesis by introducing mutation into other CCT member -> Can we transfer the regulatory role in the complex by knocking down the expression of another protein?
  42. One NOD private SNP (G>T Transversion) 200bp upstream of the

    TSS.
  43. Mediation identifies known and novel protein interactions

  44. Wash Kiaa1033 Trans Cis Kiaa0196 Fam21 Zw10 Vcp Ccdc43 llph

    Spg20 Fam45a Ccdc22 Gaa Atg16l1 Rufy1 Wash Complex Ccc Complex Ccdc93 Commd10 Commd9 9030624J02Rik Commd5 Commd7 Commd3 Commd2 Commd4 Dscr3 Cis Trans H2-Q10 Cis Trans Commd1 Pum1 1110004F10Rik Exocyst Complex Arp2/3 Complex Exoc6 Exoc2 Exoc7 Exoc8 Exoc5 Exoc4 Exoc1 Ttc39b Arpc3 Gckr Arpc5 Actr3 Arpc4 Arpc2 Actr2 Rala Coro1b Cis Cis Exoc3 Cis Cis eQTL pQTL Co-regulated Mediation reveals higher order protein networks Endosome
  45. Natural genetic perturbations + Mediation analysis = Predictive Protein Network

  46. In Progress: Using genetic diversity to identify kinases for specific

    phosphorylation sites. Liver Phospho-Proteome Kinase <–> phospho site identification by mediation
  47. Collaborative Cross strains can be used to validate predictions from

    the DO and build new models. CC001– 98% Homozygous
  48. Prediction of protein abundance in Founder and Collaborative Cross Strains.

  49. Looking ahead: Pathway-centered predictive genomics Example: Drug metabolism pathways are

    enriched for genes with significant liver pQTL. Tamoxifen
  50. Predict and test CC strain crosses that will produce progeny

    with compromised drug metabolism. Pathway-centered prediction. Toy Example. Test!

    CC Strain  Cyp3a13  Cyp3a16  Cyp2d10  Cyp2d22  Fmo1  Fmo5  Prediction
    CC001      ++       +        -        +++      -     +     Highest
    CC002      -        +        +        -        -     +     Medium
    CC003      -        -        -        +        +     +     Medium
    CC004      -        +        ---      +        --    -     Lowest
    CC005      +        --       ++       -        +     -     Medium
    CC006      +        -        -        -        -     +     Low
    CC007      +        -        +        -        +     +     High
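
The slide does not say how the "Prediction" column is derived. One simple reading, offered here only as an assumption, is a net count of plus and minus symbols across the six genes; the sketch below applies that rule to the toy values and gives an ordering consistent with the slide's labels. The function name `net_score` is made up.

```python
# Toy values copied from the slide, in gene order
# Cyp3a13, Cyp3a16, Cyp2d10, Cyp2d22, Fmo1, Fmo5.
toy_table = {
    "CC001": ["++", "+", "-", "+++", "-", "+"],
    "CC002": ["-", "+", "+", "-", "-", "+"],
    "CC003": ["-", "-", "-", "+", "+", "+"],
    "CC004": ["-", "+", "---", "+", "--", "-"],
    "CC005": ["+", "--", "++", "-", "+", "-"],
    "CC006": ["+", "-", "-", "-", "-", "+"],
    "CC007": ["+", "-", "+", "-", "+", "+"],
}


def net_score(symbols):
    """Assumed scoring rule: each '+' adds one, each '-' subtracts one."""
    return sum(s.count("+") - s.count("-") for s in symbols)


scores = {strain: net_score(row) for strain, row in toy_table.items()}
ranked = sorted(scores, key=scores.get, reverse=True)  # highest score first
```

Under this rule CC001 scores 5 and CC004 scores -5, matching the "Highest" and "Lowest" labels, and the three "Medium" strains all score 0.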
  51. Looking ahead: “Pulling the (genetic) weeds” PRODH2 “Weeds” Step 1:

    Condition out the giant cis effect.
  52. “Pulling the weeds” to identify subtle genetic interactions

    Step 2: Mediate all subthreshold peaks LOD > 5. Aka “Pull the weeds”.
    Step 3: Repeat process for all 8k proteins.
    Step 4: Find proteins that share same subthreshold peak and mediator
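
Step 4 amounts to a grouping operation once each protein's subthreshold peaks have been mediated. The sketch below assumes the results are already summarized as (protein, peak, mediator) records; the record layout, the placeholder names, and the function `shared_groups` are hypothetical, and a real analysis would need a rule for deciding when two peaks count as the same locus.

```python
from collections import defaultdict

# Hypothetical mediation hits: (target protein, subthreshold peak, best mediator).
# All names and positions are placeholders, not results from the study.
hits = [
    ("ProtA", "chr2:105Mb", "MedX"),
    ("ProtB", "chr2:105Mb", "MedX"),
    ("ProtC", "chr2:105Mb", "MedY"),
    ("ProtD", "chr11:60Mb", "MedZ"),
    ("ProtE", "chr2:105Mb", "MedX"),
]


def shared_groups(records, min_size=2):
    """Group proteins by (peak, mediator) and keep groups with several members."""
    groups = defaultdict(list)
    for protein, peak, mediator in records:
        groups[(peak, mediator)].append(protein)
    return {key: members for key, members in groups.items() if len(members) >= min_size}


shared = shared_groups(hits)
```

Here only the (chr2:105Mb, MedX) pair is shared by more than one protein, so it is the only group kept for annotation enrichment in the next step.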
  53. Lpgat1 Dync1h1 Actr1a Nup210 Arl6ip5 Dync1li2 Prodh2 Wbscr16 Etfa Ywhah

    Rock2 Ctsf Mcm2 Myo9b Ehd4 Mcm5 Frmd8 Mrps18c Maoa Ttll12 Actr1b Apba3 Lrrc59 Cdc34 Uaca Ptcd2 Dctn1 Tmem63a Nfkb2 Plcb3 Ap3b1 Rnf135 Ccdc6 Trap1 Yars2 Ggct Mtpap Dctn3 Diap1 Etnppl Cps1 Usp40 Aass Etfb Dctn2 Rbm25 Ptk2b Mme Mkl2 Ube2l3 Actr10 Ccdc91 Ebna1bp2 Cecr5 Tgfbrap1 Dync1i2 Lyrm4 Dctn4 Klc4 Sin3a Psmg1 Cpsf7 Clpp Slco2a1 Mapkapk2 Dync1li1 Smu1 Naga Scpep1
    Step 5: Look for enriched annotations among list. Dynein complex, mitochondria, amine catabolism.
  54. Conclusions

    •  Most genetic variation that affects transcript abundance does not affect protein abundance.
       –  For local genetic variation that does affect protein abundance, 80% act proximally on transcription (standard model).
    •  99+% of distant pQTL act on the target protein’s abundance independent of the target’s transcript abundance.
    •  Mediation analysis identifies 700 RNA/protein causal intermediates of distant pQTL and infers >5000 protein interactions.
    •  Stoichiometric buffering is a common post-translational mechanism governing protein abundance of binding partners and complex members.
    [Diagram: CCT complex members CCT2, CCT3, CCT5, CCT6A, CCT4, CCT7, CCT8, TCP1]
  55. Acknowledgments

  56. None
  57. Thank you!

  58. Genetic variation affects protein abundance. Liver Kidney

  59. Transcriptional mechanisms underlie most local pQTL. Over half of local pQTL affect both liver and kidney.

    [Plots: protein location vs. pQTL location, chromosomes 1-19 and X on both axes, for liver and kidney. Diagram labels: QTL, RNA, Protein, cis, cis]
  60. Example: DHTKD1– Cis pQTL liver, Cis pQTL kidney Liver Kidney

  61. Genetic variant(s) with conserved effects in liver & kidney. Founder

    strain coefficients at peak QTL SNP r = 0.99
  62. Example: LDHA – Cis pQTL liver, Cis pQTL kidney Kidney

    Liver
  63. Local variant(s) cause opposite tissue effects on LDHA expression. r

    = -0.88
  64. Example: MESDC2 – Cis pQTL liver, No pQTL kidney Liver

    Kidney
  65. No cis pQTL in kidney, but … Kidney founder coefficients

    match what we see in the liver. Effects are subtle, but there. r = 0.93
  66. Distant pQTL are abundant. Liver Kidney

  67. Post-transcriptional mechanisms underlie nearly all distant pQTL.

    [Plots: protein location vs. pQTL location, chromosomes 1-19 and X on both axes, for liver and kidney. Diagram labels: cis, RNA, Protein, QTL, trans]
  68. Post-transcriptional mechanisms underlie distant pQTL. Almost no overlap between liver

    and kidney. RNA Protein QTL trans Liver Kidney
  69. Searching for protein and transcript mediators of distant pQTL -> Mediation Analysis

    [Diagram: a distant QTL acting in trans on the target protein, with cis-regulated RNA and protein as candidate causal intermediates]
    Target Protein ~ pQTLdistant
    Target Protein ~ pQTLdistant + MediatorProtein (x 8000 proteins)
    Target Protein ~ pQTLdistant + MediatorRNA (x 21000 transcripts)
  70. Mediation analysis reveals causal intermediates. pQTLD Tmem68 TMEM68 trans 13

    Target 3 cis
  71. Tmem68 TMEM68 trans 13 cis Target 3 cis cis Nnt

    NNT Mediation analysis reveals causal intermediates.
  72. Slides can be downloaded at https://speakerdeck.com/stevemunger Lab website: mungerlab.com Twitter:

    @stevemunger