Evaluating the relationship between evolutionary divergence and phylogenetic accuracy in AFLP data sets.
Affiliation
Departamento de Bioquímica, Genética e Inmunología, Facultad de Biología, Universidade de Vigo, Vigo, Spain.
Publication information
Mol Biol Evol. 2010 May;27(5):988-1000. doi: 10.1093/molbev/msp315. Epub 2009 Dec 21.
Using in silico amplified fragment length polymorphism (AFLP) fingerprints, we explore the relationship between sequence similarity and phylogeny accuracy to test when, in terms of genetic divergence, the quality of AFLP data becomes too low to be informative for a reliable phylogenetic reconstruction. We generated DNA sequences with known phylogenies using balanced and unbalanced trees with recent, uniform and ancient radiations, and average branch lengths (from the most internal node to the tip) ranging from 0.02 to 0.4 substitutions per site. The resulting sequences were used to emulate the AFLP procedure. Trees were estimated by maximum parsimony (MP), neighbor-joining (NJ), and minimum evolution (ME) methods from both DNA sequences and virtual AFLP fingerprints. The estimated trees were compared with the reference trees using a score that measures overall differences in both topology and relative branch length. As expected, the accuracy of AFLP-based phylogenies decreased dramatically in the more divergent data sets. Above a divergence of approximately 0.05, AFLP-based phylogenies were largely inaccurate irrespective of the distinct topology, radiation model, or phylogenetic method used. This value represents an upper bound of expected tree accuracy for data sets with a simple divergence history; AFLP data sets with a similar divergence but with unbalanced topologies and short ancestral branches produced much less accurate trees. The lack of homology of AFLP bands quickly increases with divergence and reaches its maximum value (100%) at a divergence of only 0.4. Low guanine-cytosine (GC) contents increase the number of nonhomologous bands in AFLP data sets and lead to less reliable trees. However, the effect of the lack of band homology on tree accuracy is surprisingly small relative to the negative impact due to the low information content of AFLP characters. 
Tree-building methods based on genetic distance displayed similar trends and outperformed parsimony at low but not at high divergences. However, the impact of using alternative phylogenetic methods on tree accuracy was generally small relative to the uncertainty arising from factors such as divergence, nonhomology of bands, or the low information content of AFLP characters. Nevertheless, our data suggest that under certain circumstances, AFLPs may be suitable to reconstruct deeper phylogenies than usually accepted.
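The in silico AFLP step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the common EcoRI/MseI enzyme pair, ignores selective nucleotides and adapters, scores a band purely by fragment length, and uses a hypothetical 50-500 bp scoring window. The function names (`cut_positions`, `aflp_bands`, `jaccard_distance`) are illustrative only.

```python
import re

ECORI = "GAATTC"  # EcoRI recognition site, cut after the first base (G^AATTC)
MSEI = "TTAA"     # MseI recognition site, cut after the first base (T^TAA)


def cut_positions(seq, site, offset):
    """Return the cut coordinate for every occurrence of `site` in `seq`."""
    # The lookahead lets re.finditer report overlapping occurrences too.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def aflp_bands(seq, min_len=50, max_len=500):
    """Return the set of fragment lengths flanked by one EcoRI and one MseI cut.

    Assumption: only adjacent cuts from different enzymes yield a scored band,
    and the size window (min_len, max_len) is a hypothetical choice.
    """
    cuts = [(p, "E") for p in cut_positions(seq, ECORI, 1)]
    cuts += [(p, "M") for p in cut_positions(seq, MSEI, 1)]
    cuts.sort()
    bands = set()
    for (p1, e1), (p2, e2) in zip(cuts, cuts[1:]):
        length = p2 - p1
        if e1 != e2 and min_len <= length <= max_len:
            bands.add(length)
    return bands


def jaccard_distance(a, b):
    """Jaccard distance between two band sets (0.0 when both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Reference sequence: one EcoRI site and one MseI site, 66 bp between the cuts.
ref = "A" * 10 + "GAATTC" + "C" * 60 + "TTAA" + "G" * 10
# A substitution inside the EcoRI site removes the band entirely.
mutant = ref.replace("GAATTC", "GAATTG")
# An unrelated sequence that happens to yield a fragment of the same length.
unrelated = "T" * 3 + "GAATTC" + "G" * 60 + "TTAA"

bands_ref = aflp_bands(ref)
bands_mutant = aflp_bands(mutant)
bands_unrelated = aflp_bands(unrelated)
```

In this toy case `ref` and `unrelated` share a band of the same length even though the underlying fragments have nothing in common, which is the kind of band nonhomology the abstract refers to. A single substitution in a restriction site, as in `mutant`, removes the band, which shows how little information each presence/absence character carries.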