Excoffier L, Smouse P E
Department of Anthropology and Ecology, University of Geneva, Carouge, Switzerland.
Genetics. 1994 Jan;136(1):343-59. doi: 10.1093/genetics/136.1.343.
We formalize the use of allele frequency and geographic information for the construction of gene trees at the intraspecific level and extend the concept of evolutionary parsimony to molecular variance parsimony. The central principle is to consider a particular gene tree as a variable to be optimized in the estimation of a given population statistic. We propose three population statistics that are related to variance components and that are explicit functions of phylogenetic information. The methodology is applied in the context of minimum spanning trees (MSTs) and human mitochondrial DNA restriction data, but could be extended to accommodate other tree-making procedures, as well as other data types. We pursue optimal trees by heuristic optimization over a search space of more than 1.29 billion MSTs. This very large number of equally parsimonious trees underlines the lack of resolution of conventional parsimony procedures. This lack of resolution is highlighted by the observation that equally parsimonious trees yield very different estimates of population genetic diversity and genetic structure, as shown by null distributions of the population statistics, obtained by evaluation of 10,000 random MSTs. We propose a non-parametric test for the similarity between any two trees, based on the distribution of a weighted coevolutionary correlation. The ability to test for tree relatedness leads to the definition of a class of solutions instead of a single solution. Members of the class share virtually all of the critical internal structure of the tree but differ in the placement of singleton branch tips.
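The abstract's point that equally parsimonious MSTs can yield different population statistics can be illustrated with a small sketch. This is not the authors' procedure. It assumes haplotypes coded as 0/1 restriction-site strings, uses Kruskal's algorithm with random tie-breaking to draw one of the equal-length MSTs (which reaches every MST but does not sample them uniformly), and uses the mean squared path length between haplotypes along the tree as a stand-in for the variance-related statistics the paper proposes. The names `random_mst` and `mean_squared_path_length` and the five toy haplotypes are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of sites at which two equal-length haplotype strings differ."""
    return sum(x != y for x, y in zip(a, b))


def random_mst(haplotypes, rng):
    """Return one minimum spanning tree as a list of (i, j, weight) edges.

    Ties between equal-weight edges are broken at random, so repeated calls
    can return different, equally short trees.
    """
    n = len(haplotypes)
    edges = [(hamming(haplotypes[i], haplotypes[j]), i, j)
             for i, j in combinations(range(n), 2)]
    rng.shuffle(edges)
    edges.sort(key=lambda e: e[0])  # stable sort keeps the shuffled order within ties

    parent = list(range(n))  # union-find forest for cycle detection

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j, w))
    return tree


def mean_squared_path_length(tree, n):
    """Illustrative statistic: mean squared tree distance over all haplotype pairs."""
    adj = defaultdict(list)
    for i, j, w in tree:
        adj[i].append((j, w))
        adj[j].append((i, w))
    total = 0
    for src in range(n):
        dist = {src: 0}
        stack = [src]
        while stack:
            u = stack.pop()
            for v, w in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + w
                    stack.append(v)
        # Count each unordered pair once.
        total += sum(d * d for node, d in dist.items() if node > src)
    return total / (n * (n - 1) // 2)


# Toy data (assumed): five haplotypes scored at four restriction sites.
haplotypes = ["0000", "1000", "0100", "1100", "1110"]
rng = random.Random(0)
trees = [random_mst(haplotypes, rng) for _ in range(200)]
values = [mean_squared_path_length(t, len(haplotypes)) for t in trees]
print(sorted(set(values)))
```

Every tree drawn here has the same total length, yet the statistic takes more than one value across draws. On a small scale this is the lack of resolution the abstract describes, and the list `values` plays the role of a null distribution over random MSTs.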