[Figure: two example trees with tips labeled by group (A, B, E) and gi identifier; scale bar 0.2]
Rooted trees: include an assumption about the last common ancestor of all sequences
Unrooted trees: no assumption about the last common ancestor of all sequences
Trees are often built from gene sequences, and thus represent gene trees. If the genes are orthologous, this can also represent a species tree.
at a subset of the possible trees, and don’t guarantee to find the best tree.
• Designed to scale to trees for many OTUs (how well they scale depends on the method, and there is a lot of variability)
• Often provide a single tree, so do not include information on how likely other tree topologies are (we’ll talk about methods, such as bootstrapping, to address this).
(non-negativity)
– d(x,y) = 0 if and only if x = y (identity of indiscernibles)
– d(x,y) = d(y,x) (symmetry)
– d(x,z) <= d(x,y) + d(y,z) (triangle inequality)
http://www-history.mcs.st-and.ac.uk/~john/MT4522/Lectures/L5.html
http://en.wikipedia.org/wiki/Metric_(mathematics)
Y (20.0)
      x    y    z
 x    0   14    4
 y   14    0   18
 z    4   18    0
Distance:
– d(x,y) >= 0 (non-negativity)
– d(x,y) = 0 if and only if x = y (identity of indiscernibles)
– d(x,y) = d(y,x) (symmetry)
– d(x,z) <= d(x,y) + d(y,z) (triangle inequality)
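The four properties above can be checked mechanically for the x, y, z matrix. The following is a minimal sketch, not part of the original slides; the function name `is_metric` and the dictionary layout are assumptions made for illustration.

```python
from itertools import product

# Distance matrix from the slide: d(x,y) = 14, d(x,z) = 4, d(y,z) = 18.
labels = ["x", "y", "z"]
D = {
    ("x", "x"): 0, ("x", "y"): 14, ("x", "z"): 4,
    ("y", "x"): 14, ("y", "y"): 0, ("y", "z"): 18,
    ("z", "x"): 4, ("z", "y"): 18, ("z", "z"): 0,
}


def is_metric(d, items):
    """Hypothetical helper: True if d satisfies all four properties on items."""
    for a, b in product(items, repeat=2):
        if d[a, b] < 0:                    # non-negativity
            return False
        if (d[a, b] == 0) != (a == b):     # identity of indiscernibles
            return False
        if d[a, b] != d[b, a]:             # symmetry
            return False
    for a, b, c in product(items, repeat=3):
        if d[a, c] > d[a, b] + d[b, c]:    # triangle inequality
            return False
    return True


print(is_metric(D, labels))
```

For this matrix the triangle inequality holds with equality in one case: d(y,z) = 18 = d(y,x) + d(x,z) = 14 + 4.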
• Unweighted: all tip-to-tip distances contribute equally
• Pair-group: all branch points lead to exactly two clades
• Arithmetic mean: distances to each clade are the mean of distances to all members of that clade
http://www.southampton.ac.uk/~re1u06/teaching/upgma/
the matrix and create a new group containing only those members.
Step 2: Create a new distance matrix with an entry representing the clade created in step 1. Calculate the mean distance from each of the tips of the new clade to all other tips in the distance matrix.
Step 3: If there is only one distance in the distance matrix, stop. Otherwise repeat step 1.
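The steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used in the lecture: the function name `upgma`, the `frozenset`-keyed distance dictionary, and the nested-tuple tree are all assumptions. Step 1 is taken to mean "find the smallest distance in the matrix", since the start of that step is cut off above. Clade-to-clade distances are recomputed as the mean over all tip-to-tip distances, which is what "unweighted" and "arithmetic mean" refer to.

```python
from itertools import combinations


def upgma(tips, dist):
    """Cluster tips by UPGMA (illustrative sketch).

    dist maps frozenset({a, b}) -> distance for every pair of tips.
    Returns (tree, merges): tree is a nested tuple giving the topology,
    merges lists (clade_a, clade_b, distance) in the order clades were joined.
    """
    # Each active clade is keyed by its subtree and maps to the tips it contains.
    clades = {tip: (tip,) for tip in tips}
    merges = []

    def mean_distance(members_a, members_b):
        # Unweighted: every tip-to-tip distance contributes equally.
        total = sum(dist[frozenset((a, b))] for a in members_a for b in members_b)
        return total / (len(members_a) * len(members_b))

    while len(clades) > 1:
        # Step 1 (assumed): pick the pair of clades with the smallest distance.
        a, b = min(
            combinations(clades, 2),
            key=lambda pair: mean_distance(clades[pair[0]], clades[pair[1]]),
        )
        merges.append((a, b, mean_distance(clades[a], clades[b])))
        # Step 2: replace the two clades with a single entry for the new clade.
        clades[(a, b)] = clades.pop(a) + clades.pop(b)
        # Step 3: the loop also joins the final two clades, which is the
        # single remaining distance the slide stops at.

    (tree,) = clades
    return tree, merges


# Distances from the x, y, z example: d(x,y) = 14, d(x,z) = 4, d(y,z) = 18.
dist = {
    frozenset(("x", "y")): 14,
    frozenset(("x", "z")): 4,
    frozenset(("y", "z")): 18,
}
tree, merges = upgma(["x", "y", "z"], dist)
print(tree)
for clade_a, clade_b, d in merges:
    print(clade_a, clade_b, d)
```

On the three-point example, x and z are joined first at distance 4, and y then joins the (x, z) clade at the mean distance (14 + 18) / 2 = 16. Recomputing means from the original tip distances keeps the sketch short; a practical implementation would update the matrix in place instead.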
compiled while reviewing the following sources:
• The Phylogenetic Handbook (Lemey, Salemi, Vandamme)
• Inferring Phylogeny (Felsenstein)
• Richard Edwards’s teaching website: http://www.southampton.ac.uk/~re1u06/teaching/upgma/
United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
Feel free to use or modify these slides, but please credit me by placing the following attribution information where you feel that it makes sense: Greg Caporaso, www.caporaso.us.