Xu S, Atchley W R, Fitch W M
Department of Genetics, North Carolina State University, Raleigh 27695-7614.
Mol Biol Evol. 1994 Nov;11(6):949-60. doi: 10.1093/oxfordjournals.molbev.a040175.
When pairwise genetic distances are used for phylogenetic reconstruction, it is usually assumed that the genetic distance between two taxa contains information about the time since the two taxa diverged. As a result, after an appropriate transformation if necessary, the distance can usually be fitted to a linear model in which it is expressed as the sum of the lengths of all branches that connect the two taxa in a given phylogeny. This kind of distance is referred to as "additive distance." For a phylogenetic tree driven exclusively by random genetic drift, genetic distances related to coancestry coefficients (θ_XY) between any two taxa are more suitable. However, these distances are fundamentally different from the additive distance in that coancestry does not contain any information about the time after two taxa split from a common ancestral population; instead, it reflects the time before the two taxa diverged. In other words, the magnitude of θ_XY provides information about how long the two taxa shared the same evolutionary pathway. This fundamental difference between the two kinds of distances has led to a different algorithm for evaluating phylogenetic trees when θ_XY and related distance measures are used. Here we present the new algorithm, which uses the ordinary-least-squares approach but fits a different linear model. This treatment allows genetic variation within a taxon to be included in the model. Monte Carlo simulation for a rooted phylogeny of four taxa has verified the efficacy and consistency of the new method. Application of the method to human populations is demonstrated.
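
The contrast with additive distances can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not spell out the linear model, so the sketch rests on stated assumptions: the (possibly transformed) coancestry measure for a pair of taxa is taken to equal the summed length of the branches the two taxa share between the root and their most recent common ancestor, and the within-taxon value is taken to equal the full root-to-tip length. The four-taxon topologies, the branch labels a to f, the branch lengths, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement

TAXA = "ABCD"
# All unordered pairs, including (X, X), so within-taxon values enter the fit.
PAIRS = list(combinations_with_replacement(TAXA, 2))


def design_matrix(root_paths, branches):
    """One row per pair; 1.0 marks branches lying on both root-to-tip paths."""
    rows = []
    for x, y in PAIRS:
        shared = set(root_paths[x]) & set(root_paths[y])
        rows.append([1.0 if br in shared else 0.0 for br in branches])
    return rows


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(root_paths, theta):
    """Ordinary least squares via the normal equations (X'X) beta = X'y.

    Returns the estimated branch lengths and the residual sum of squares.
    """
    branches = sorted({br for path in root_paths.values() for br in path})
    x = design_matrix(root_paths, branches)
    y = [theta[pair] for pair in PAIRS]
    k = range(len(branches))
    xtx = [[sum(r[i] * r[j] for r in x) for j in k] for i in k]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in k]
    beta = solve(xtx, xty)
    rss = sum(
        (yi - sum(c * bj for c, bj in zip(r, beta))) ** 2
        for r, yi in zip(x, y)
    )
    return dict(zip(branches, beta)), rss


# Hypothetical rooted topologies, written as root-to-tip branch paths.
TREE_AB_CD = {"A": ["e", "a"], "B": ["e", "b"], "C": ["f", "c"], "D": ["f", "d"]}
TREE_AC_BD = {"A": ["e", "a"], "C": ["e", "c"], "B": ["f", "b"], "D": ["f", "d"]}

# Made-up branch lengths, used only to generate noise-free synthetic values.
TRUE = {"a": 0.10, "b": 0.20, "c": 0.15, "d": 0.05, "e": 0.30, "f": 0.25}
theta = {
    (x, y): sum(TRUE[br] for br in set(TREE_AB_CD[x]) & set(TREE_AB_CD[y]))
    for x, y in PAIRS
}

est, rss_true = fit(TREE_AB_CD, theta)
_, rss_alt = fit(TREE_AC_BD, theta)
print(est, rss_true, rss_alt)
```

Under these assumptions the generating topology reproduces the synthetic values with essentially zero residual, while the alternative topology leaves a clearly positive residual sum of squares, which is one plausible way to rank candidate trees under a least-squares criterion. An additive-distance fit would instead mark the branches on the path connecting the two taxa, so the design matrices of the two models differ even for the same tree.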