Andrieu G, Caraux G, Gascuel O
Département d'Informatique Fondamentale, L.I.R.M.M., Montpellier, France.
Mol Biol Evol. 1997 Aug;14(8):875-82. doi: 10.1093/oxfordjournals.molbev.a025829.
Two methods are commonly employed for evaluating the extent of the uncertainty of evolutionary distances between sequences: either some estimator of the variance of the distance estimator, or the bootstrap method. However, both approaches can be misleading, particularly when the evolutionary distance is small. We propose using another statistical method which does not have the same defect: interval estimation. We show how confidence intervals may be constructed for the Jukes and Cantor (1969) and Kimura two-parameter (1980) estimators. We compare the exact confidence intervals thus obtained with the approximate intervals derived by the two previous methods, using artificial and biological data. The results show that the usual methods clearly underestimate the variability when the substitution rate is low and when sequences are short. Moreover, our analysis suggests that similar results may be expected for other evolutionary distance estimators.
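The contrast drawn in the abstract can be illustrated for the Jukes and Cantor distance with a small sketch. This is not the authors' procedure. It assumes the number of differing sites k among n aligned sites is binomial, builds a Clopper-Pearson interval for the proportion p, and maps its endpoints through d = -(3/4) ln(1 - 4p/3), which is increasing in p. For comparison it also computes the usual variance-based interval from the delta-method variance p(1 - p) / (n (1 - 4p/3)^2). All function names are hypothetical.

```python
import math
from statistics import NormalDist


def jc_distance(p):
    """Jukes-Cantor distance for a proportion p of differing sites."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0) if p < 0.75 else math.inf


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in log space; assumes 0 < p < 1."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(1.0, total)


def solve_increasing(f, lo=0.0, hi=1.0, iters=200):
    """Bisection for the root of an increasing function f on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k, n, alpha=0.05):
    """Clopper-Pearson interval for a binomial proportion (assumed construction)."""
    # Lower bound: P(X >= k | p) = alpha/2, or 0 when no differences are observed.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: (1.0 - binom_cdf(k - 1, n, p)) - alpha / 2.0)
    # Upper bound: P(X <= k | p) = alpha/2, or 1 when every site differs.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: alpha / 2.0 - binom_cdf(k, n, p))
    return lower, upper


def jc_exact_interval(k, n, alpha=0.05):
    """Map the interval for p through the monotone Jukes-Cantor transform."""
    lower, upper = clopper_pearson(k, n, alpha)
    return jc_distance(lower), jc_distance(upper)


def jc_delta_interval(k, n, alpha=0.05):
    """Variance-based interval: d_hat +/- z * sqrt(Var), delta-method variance."""
    p = k / n
    d = jc_distance(p)
    if math.isinf(d):
        return math.inf, math.inf
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = math.sqrt(p * (1.0 - p) / n) / (1.0 - 4.0 * p / 3.0)
    return d - z * se, d + z * se
```

With two identical sequences (k = 0), `jc_delta_interval(0, 100)` collapses to a zero-width interval at 0, while `jc_exact_interval(0, 100)` still has a strictly positive upper bound. That is the small-distance, short-sequence situation in which the abstract says the usual methods underestimate variability.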