An introduction to simple and complex traits in humans, and how to study them in mice.
This slide deck describes the difference between simple and complex genetic traits in humans, how complex traits are studied in humans, and especially how we study them in laboratory mice.
Questions to Answer • What are the differences between simple and complex traits? • How do we study the genetic basis of complex traits in humans? • Why and how do we study complex traits in mice? • What is “21st Century Mouse Genetics”?
What is a trait? Trait (n.): a distinguishing feature of your personal nature. In science, a trait is a characteristic that is caused, at least in part, by genetics. A disease can be considered a trait. Traits can be classified by their inheritance pattern. • Simple trait: arises from mutations in a single gene (= “Mendelian” trait = “Binary” trait). • Complex trait: affected by many genes (= “Quantitative” trait = “Multifactorial” trait).
Most human traits are not binary and simple, but rather continuous and complex. These quantitative traits derive from the interplay of many genes (“polygenic”) and the environment.
Your height is determined by a complex interplay of your genetics and your environment. Genome + Environment Let’s dig into genetic networks a little more
But what does that tell us? The Arby’s example. I originally made this example for students at Bates College in Lewiston, Maine. The scenario is this: You are really craving an Arby’s Beef ‘n Cheddar sandwich, but you don’t know where the Arby’s is located in Lewiston-Auburn, and you don’t have a car. How could you figure out where the Arby’s is located using the Citylink bus routes (and without seeing the actual Arby’s restaurant)?
If you can measure it, you can run a GWAS on it. But buyer beware (of confounding variables). None of these genes/variants are known to be associated with food or alcohol intake. But in the UK Biobank, this trait is positively correlated with "Average total household income before tax" and inversely correlated with "Job involves heavy manual or physical work".
Type 2 diabetes (T2D) is a complex disease caused by genetic and environmental factors. GWAS has identified hundreds of genetic variants associated with T2D. The vast majority are found in non-coding regions of the genome.
So we can use GWAS to identify genetic variants associated with T2D. [Figure labels: Maternal Diet, Air/Water Quality, Stress, Nutrition, Pregnancy Length, Drug Use.] But how do we go from genes → mechanisms → therapeutics?
Humans are terrible genetic models (Novembre et al. 2008) • Population stratification can cause false positives in association studies. • They take too long to breed, and live way too long. • Environmental variance can mask genetic influence. • Many adult traits originate during development.
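To make the population stratification point concrete, here is a minimal simulation sketch. All numbers are hypothetical: two groups differ in both allele frequency and average trait value, and the variant has no causal effect on the trait at all. Pooling the groups still produces an apparent genotype-trait correlation, while the correlation within each group stays near zero.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_population(rng, n, allele_freq, trait_mean):
    """Return genotypes (0/1/2 allele copies) and a trait that ignores genotype."""
    genotypes = [sum(rng.random() < allele_freq for _ in range(2)) for _ in range(n)]
    traits = [rng.gauss(trait_mean, 1.0) for _ in range(n)]
    return genotypes, traits


rng = random.Random(1)
# Hypothetical groups: they differ in allele frequency AND in trait mean
# (for example, through environment), but the variant itself does nothing.
g_a, y_a = simulate_population(rng, 2000, allele_freq=0.8, trait_mean=1.0)
g_b, y_b = simulate_population(rng, 2000, allele_freq=0.2, trait_mean=0.0)

r_pooled = pearson(g_a + g_b, y_a + y_b)  # analysis that ignores ancestry
r_within_a = pearson(g_a, y_a)            # analysis within group A only
r_within_b = pearson(g_b, y_b)            # analysis within group B only

print(f"pooled r = {r_pooled:.3f}")
print(f"within-group r = {r_within_a:.3f}, {r_within_b:.3f}")
```

The pooled correlation here is a false positive driven entirely by group membership. Real association studies adjust for ancestry in more sophisticated ways; this sketch only illustrates why some adjustment is needed.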
Emerging themes Most traits and diseases are complex and polygenic. No gene is an island - The function of any gene depends on the genetic background it is part of. - The effects of a mutation in one gene may be amplified or buffered (i.e., modified) by variation in another gene (a modifier). - A disease-causing mutation may lead to severe pathology in some people but mild or no pathology in others. We can identify these modifier variants in humans using Genome-Wide Association Studies (GWAS), but we’re limited in what we can study (and more specifically, how we can study it) in humans themselves.
Mice are exceptional animal models • Physiologically and anatomically similar to humans. • Breed early and often. • Access to tissues at all stages of development and adulthood. • Can be inbred and genetically modified. • Highly characterized (genome sequence, curated databases). • Can be used to map modifiers and build networks. • No, they are not little humans, but…
What is an inbred strain? • Genetically identical • Animals that result from the process of brother-sister mating for at least 20 sequential generations
Consequences of Inbreeding Up to 20 Generations (Silver 1995). [Figure: x-axis, generations of inbreeding (0 to 20); y-axis, 0% to 100%. Curves: individual homozygosity at current generation; portion of genome that is fixed between two breeding sibs chosen for producing next generation. Annotated value: 98.6%.]
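The shape of the inbreeding curve can be sketched with the textbook recurrence for the inbreeding coefficient under repeated brother-sister mating, F_t = (1 + 2·F_{t-1} + F_{t-2}) / 4. This is an illustration under stated assumptions (unrelated, non-inbred founders; generation t counted as the offspring of the t-th sib mating) and not necessarily the calculation behind the figure. The function name is hypothetical.

```python
def sib_mating_inbreeding(generations):
    """Inbreeding coefficient F_t for t = 0..generations of full-sib mating.

    Assumes the textbook recurrence F_t = (1 + 2*F_{t-1} + F_{t-2}) / 4
    and founders that are unrelated and not inbred (F = 0 at the start).
    """
    f_prev2, f_prev1 = 0.0, 0.0  # F_{-1} and F_0
    coefficients = [f_prev1]
    for _ in range(generations):
        f_now = (1.0 + 2.0 * f_prev1 + f_prev2) / 4.0
        coefficients.append(f_now)
        f_prev2, f_prev1 = f_prev1, f_now
    return coefficients


F = sib_mating_inbreeding(20)
for t in (1, 5, 10, 20):
    print(f"generation {t:2d}: F = {F[t]:.3f}")
```

Under this indexing convention the coefficient at generation 20 comes out at roughly 0.986, in line with the 98.6% annotated on the slide. Shifting the generation count by one changes the value slightly, so the match depends on that convention.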
Two-parent QTL Crosses • Easy to map significant QTLs. Straightforward analysis. • Fewer mice required. – Even recessive alleles will be homozygous in ¼ of progeny. • All mice need to be genotyped ($). • Number of recombination events per mouse is low. Less recombination = lower mapping resolution = more mice ($). • Confidence intervals tend to be broad and resolving the causative gene may require heroic follow-up experiments.
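The "¼ of progeny" point can be checked with a small simulation of an F2 intercross at a single locus. This is a minimal sketch with assumed labels: the two parental strains are called A and B, every F1 parent is A/B, and each F2 mouse inherits one randomly chosen allele from each parent.

```python
import random
from collections import Counter


def f2_genotype(rng):
    """Genotype of one F2 mouse at one locus: each A/B F1 parent passes A or B."""
    return "".join(sorted(rng.choice("AB") + rng.choice("AB")))


rng = random.Random(2)
n_mice = 10_000
counts = Counter(f2_genotype(rng) for _ in range(n_mice))
fractions = {g: counts[g] / n_mice for g in ("AA", "AB", "BB")}
print(fractions)
```

The fractions should land close to the expected 1:2:1 ratio, so a recessive allele from either parental strain is homozygous in about a quarter of F2 mice.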
Variations on a theme: Recombinant Inbred Lines (RILs). BXD Lines (Rob Williams; Ashbrook et al. 2019; Genenetwork.org). Each BXD line is inbred, replicable, and fully genotyped, and is associated with a lot of historical data.
Diversity Outbred and Collaborative Cross mice: powerful orthogonal resources for gene discovery and validation. Diversity Outbred (DO): • Balanced population structure • 400+ recombinations per animal • High heterozygosity • Each animal is unique. Founders: 5 common lab + 3 wild-derived strains; 52 million+ SNVs, 2 million+ indels. Collaborative Cross (CC): • Reproducible genomes • High genetic diversity • Fewer recombinations per line
The Collaborative Cross (CC) is a panel of replicable, diverse inbred strains for disease research and genetic mapping. *CC strains can also be used to validate observations/predictions from the DO.
Balanced allele frequencies in the DO (Svenson et al., Genetics, 2011). Founder strains: A/J, C57BL/6J, 129S1/SvImJ, NOD/ShiLtJ, NZO/LtJ, CAST/EiJ, PWK/PhJ, WSB/EiJ. *Nearly every gene in the genome has genetic variants segregating in the DO that are potentially functional.
DO mice are genetically and phenotypically diverse (Alan Attie & Mark Keller). [Figure: body weight (gm), 10 to 50, by date, 7/11/2014 to 8/20/2014; panels: female DO mice, male DO mice.]
None of the founder inbred strains live past 2.5 years. How is it possible that mixing their genomes up can result in a mouse that lives almost twice as long?
DO Genomes: 28 possible heterozygous and 8 homozygous genotype states at every locus.
Founder strain codes: A/J = A, BL6 = B, 129 = C, NOD = D, NZO = E, CAST = F, PWK = G, WSB = H.
Founder strains: 8 possible genotypes. Diversity Outbred: 36 possible genotype states.
1 AA, 2 AB, 3 AC, 4 AD, 5 AE, 6 AF, 7 AG, 8 AH,
9 BB, 10 BC, 11 BD, 12 BE, 13 BF, 14 BG, 15 BH,
16 CC, 17 CD, 18 CE, 19 CF, 20 CG, 21 CH,
22 DD, 23 DE, 24 DF, 25 DG, 26 DH,
27 EE, 28 EF, 29 EG, 30 EH,
31 FF, 32 FG, 33 FH,
34 GG, 35 GH,
36 HH
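The count of 36 states follows from choosing an unordered pair of founder alleles, with repeats allowed, from the eight letter codes. A short sketch that enumerates them (variable names are illustrative):

```python
from itertools import combinations_with_replacement

FOUNDERS = "ABCDEFGH"  # letter codes for the eight founder strains

# Unordered pairs with repetition: AA, AB, ..., AH, BB, ..., HH
states = ["".join(pair) for pair in combinations_with_replacement(FOUNDERS, 2)]
homozygous = [s for s in states if s[0] == s[1]]
heterozygous = [s for s in states if s[0] != s[1]]

print(len(states), len(homozygous), len(heterozygous))  # prints: 36 8 28
```

The enumeration order matches the numbering on the slide, so `states[0]` is AA, `states[8]` is BB, and `states[35]` is HH.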
Genetic Mapping in DO Mice: Genotyping Arrays, Hidden Markov Model, Genome Reconstruction. Linkage and Association: $y_i = \sum_{j=1}^{8} \alpha_j g_{ij} + \sum_{k=1}^{c} \beta_k x_{ik} + \gamma_i + \varepsilon_i$
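As a rough illustration of the linkage step, the sketch below reads the mapping model on this slide as a regression of phenotype on eight founder allele dosages plus covariates, and scores one locus with a LOD statistic. The slide does not define its symbols, so this reading is an assumption. Several simplifications are also assumptions made for brevity: the data are simulated, hard genotype calls stand in for HMM genotype probabilities, sex is the only covariate, the per-mouse term that appears in the slide's equation alongside the residual is omitted, and the effect sizes are hypothetical. The helper names `solve` and `residual_ss` are made up for this example.

```python
import math
import random

FOUNDERS = "ABCDEFGH"


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def residual_ss(design, y):
    """Residual sum of squares from ordinary least squares of y on the design columns."""
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    coef = solve(xtx, xty)
    fitted = [sum(cj * xj for cj, xj in zip(coef, row)) for row in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


rng = random.Random(3)
n = 300
dosages, sex, y = [], [], []
for _ in range(n):
    alleles = [rng.choice(FOUNDERS), rng.choice(FOUNDERS)]
    g = [float(alleles.count(f)) for f in FOUNDERS]  # founder allele dosages, sum to 2
    s = float(rng.random() < 0.5)                    # single covariate: sex coded 0/1
    dosages.append(g)
    sex.append(s)
    # Hypothetical truth: each "F" (CAST) allele adds 2 units; sex adds 0.5.
    y.append(2.0 * g[FOUNDERS.index("F")] + 0.5 * s + rng.gauss(0.0, 1.0))

full = [g + [s] for g, s in zip(dosages, sex)]  # 8 founder dosages + covariate
null = [[1.0, s] for s in sex]                  # intercept + covariate only
rss_full, rss_null = residual_ss(full, y), residual_ss(null, y)
lod = (n / 2.0) * math.log10(rss_null / rss_full)
print(f"LOD = {lod:.1f}")
```

The full model has no separate intercept because the eight dosages always sum to 2, which already spans a constant column. A real DO scan would repeat this at every locus using genotype probabilities and would typically account for relatedness among mice.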
Benzene Study Endpoints • Pre- and post-exposure blood • Post-exposure bone marrow • Proportion of micronucleated reticulocytes (MN-RET): a measure of chromosomal damage. French et al., Environmental Health Perspectives, 2015
The founder origin of each allele is tagged and provides direct estimates of allelic abundance. The local eQTL for the lincRNA Gm12976 is cis-acting. [Figure labels: DO samples, N=277; allele-specific estimates, N=554.]
Unique power + unique challenges Multi-parent, multi-generation crosses like the DO offer high genetic diversity (45M SNPs) and fine recombination block structure. Increased complexity requires specialized methods for haplotype reconstruction and mapping. QTL confidence intervals can be very small, but require more samples for mapping. Founder sequences can help to identify causal variants. Word of caution: If your phenotype is affected by many variants with small effects segregating in the DO, you will need ++++ mice to map them.
Summary • Most common diseases are polygenic and stem from complex interactions between one’s genetic background and their environment. • No gene is an island. Genes interact within networks and pathways. • We can apply genetic mapping in human cohorts to identify risk variants associated with complex traits/diseases. • We can leverage the power of the mouse model and emerging diversity resources to refine and expand our understanding of complex human diseases.