Bandelt H J, Forster P, Röhl A
Mathematisches Seminar, Universität Hamburg, Germany.
Mol Biol Evol. 1999 Jan;16(1):37-48. doi: 10.1093/oxfordjournals.molbev.a026036.
Reconstructing phylogenies from intraspecific data (such as human mitochondrial DNA variation) is often a challenging task because of large sample sizes and small genetic distances between individuals. The resulting multitude of plausible trees is best expressed by a network which displays alternative potential evolutionary paths in the form of cycles. We present a method ("median joining" [MJ]) for constructing networks from recombination-free population data that combines features of Kruskal's algorithm for finding minimum spanning trees by favoring short connections, and Farris's maximum-parsimony (MP) heuristic algorithm, which sequentially adds new vertices called "median vectors", except that our MJ method does not resolve ties. The MJ method is hence closely related to the earlier approach of Foulds, Hendy, and Penny for estimating MP trees but can be adjusted to the level of homoplasy by setting a parameter epsilon. Unlike our earlier reduced median (RM) network method, MJ is applicable to multistate characters (e.g., amino acid sequences). An additional feature is the speed of the implemented algorithm: a sample of 800 worldwide mtDNA hypervariable segment I sequences requires less than 3 h on a Pentium 120 PC. The MJ method is demonstrated on a Tibetan mitochondrial DNA RFLP data set.
从种内数据(如人类线粒体DNA变异)重建系统发育树通常是一项具有挑战性的任务,因为样本量较大且个体间的遗传距离较小。由此产生的众多合理树状图最好用网络来表示,该网络以循环的形式展示替代的潜在进化路径。我们提出了一种方法(“中位数连接”[MJ]),用于从无重组群体数据构建网络,该方法结合了Kruskal算法(通过倾向于短连接来寻找最小生成树)和Farris的最大简约(MP)启发式算法的特点,后者依次添加称为“中位数向量”的新顶点,不同的是我们的MJ方法不解决平局情况。因此,MJ方法与Foulds、Hendy和Penny早期估计MP树的方法密切相关,但可以通过设置参数ε来调整到同塑性水平。与我们早期的简约中位数(RM)网络方法不同,MJ适用于多状态特征(如氨基酸序列)。另一个特点是所实现算法的速度:在奔腾120个人计算机上,对800个全球线粒体DNA高变区I序列的样本进行分析所需时间不到3小时。在一个藏族线粒体DNA RFLP数据集上展示了MJ方法。