Wong Wendy S W, Yang Ziheng, Goldman Nick, Nielsen Rasmus
Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850, USA.
Genetics. 2004 Oct;168(2):1041-51. doi: 10.1534/genetics.104.031153.
The parsimony method of Suzuki and Gojobori (1999) and the maximum likelihood method developed from the work of Nielsen and Yang (1998) are two widely used methods for detecting positive selection in homologous protein-coding sequences. Both methods consider an excess of nonsynonymous (replacement) substitutions as evidence for positive selection. Previously published simulation studies comparing the performance of the two methods show contradictory results. Here we conduct a more thorough simulation study that covers and extends the parameter space used in previous studies. We also reanalyze an HLA data set that was previously reported to cause problems when analyzed using the maximum likelihood method. Our new simulations and the reanalysis of the HLA data demonstrate that the maximum likelihood method has good power and accuracy in detecting positive selection over a wide range of parameter values. The poor performance reported in previous studies appears to be due to numerical problems in the optimization algorithms and does not reflect the true performance of the method. The parsimony method has a very low rate of false positives but very little power for detecting positive selection or identifying positively selected sites.
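The criterion shared by both methods, an excess of nonsynonymous over synonymous substitutions, is commonly summarized as the ratio ω = dN/dS, with ω > 1 read as evidence for positive selection. The sketch below illustrates that ratio only; it is neither the parsimony method nor the maximum likelihood method evaluated here. It assumes the substitution counts and site counts are already available, and it applies a Jukes-Cantor correction for multiple hits, which is a simplifying choice made for this example. The function names and inputs are hypothetical.

```python
import math


def jukes_cantor(p: float) -> float:
    """Correct a proportion of differing sites for multiple hits (Jukes-Cantor)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75) for the correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def omega_ratio(n_nonsyn: float, n_syn: float,
                nonsyn_sites: float, syn_sites: float) -> float:
    """Return omega = dN/dS from substitution counts and site counts.

    All four inputs are assumed to be precomputed for a pair of aligned
    coding sequences (hypothetical inputs, not taken from the paper).
    """
    d_n = jukes_cantor(n_nonsyn / nonsyn_sites)  # nonsynonymous rate per site
    d_s = jukes_cantor(n_syn / syn_sites)        # synonymous rate per site
    if d_s == 0.0:
        raise ValueError("dS is zero; omega is undefined")
    return d_n / d_s


if __name__ == "__main__":
    omega = omega_ratio(n_nonsyn=30, n_syn=5, nonsyn_sites=300, syn_sites=100)
    verdict = "excess of nonsynonymous substitutions" if omega > 1 else "no excess"
    print(f"omega = {omega:.3f} ({verdict})")
```

A single pairwise ratio such as this averages over all sites and lineages, so it says nothing about which individual sites are under selection; identifying those sites is the task on which the two methods are compared.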