Accuracy and power of statistical methods for detecting adaptive evolution in protein coding sequences and for identifying positively selected sites.

Authors

Wong Wendy S W, Yang Ziheng, Goldman Nick, Nielsen Rasmus

Affiliation

Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850, USA.

Publication

Genetics. 2004 Oct;168(2):1041-51. doi: 10.1534/genetics.104.031153.

DOI: 10.1534/genetics.104.031153

PMID: 15514074

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1448811/

Abstract

The parsimony method of Suzuki and Gojobori (1999) and the maximum likelihood method developed from the work of Nielsen and Yang (1998) are two widely used methods for detecting positive selection in homologous protein coding sequences. Both methods consider an excess of nonsynonymous (replacement) substitutions as evidence for positive selection. Previously published simulation studies comparing the performance of the two methods show contradictory results. Here we conduct a more thorough simulation study to cover and extend the parameter space used in previous studies. We also reanalyzed an HLA data set that was previously proposed to cause problems when analyzed using the maximum likelihood method. Our new simulations and a reanalysis of the HLA data demonstrate that the maximum likelihood method has good power and accuracy in detecting positive selection over a wide range of parameter values. Previous studies reporting poor performance of the method appear to be due to numerical problems in the optimization algorithms and did not reflect the true performance of the method. The parsimony method has a very low rate of false positives but very little power for detecting positive selection or identifying positively selected sites.
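The false-positive rate and power discussed above are typically estimated as rejection rates over simulated replicates. The following is a minimal sketch of that bookkeeping for a likelihood ratio test, not the authors' code. It assumes the alternative model has two more free parameters than the null, so the chi-square distribution with 2 degrees of freedom is used as the reference (an assumption for illustration; the abstract does not state it). All function names and log-likelihood values are hypothetical. Because the models are nested, a clearly negative test statistic cannot occur at the true maxima, so such replicates are counted separately as a sign that the optimizer may not have converged.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (lnl_alt - lnl_null)


def lrt_p_value(stat: float) -> float:
    """P-value under chi-square with 2 df (assumed), whose survival
    function is exp(-x / 2). Negative statistics are clamped to zero."""
    return math.exp(-max(stat, 0.0) / 2.0)


def summarize(replicates, alpha=0.05, tol=1e-6):
    """Return (rejection rate, number of suspect replicates).

    replicates: list of (lnL under null, lnL under alternative) pairs.
    A replicate is suspect when the alternative fits worse than the
    nested null by more than tol, which points to an optimization problem.
    """
    rejected = 0
    suspect = 0
    for lnl_null, lnl_alt in replicates:
        stat = lrt_statistic(lnl_null, lnl_alt)
        if stat < -tol:
            suspect += 1
        if lrt_p_value(stat) < alpha:
            rejected += 1
    return rejected / len(replicates), suspect


# Hypothetical log-likelihoods, made up purely for illustration.
null_reps = [(-1000.0, -999.5), (-1200.0, -1199.0), (-900.0, -901.0)]
selected_reps = [(-1000.0, -996.0), (-1200.0, -1190.0), (-900.0, -899.8)]

# Rejection rate on data simulated without positive selection
# estimates the false-positive rate.
false_positive_rate, suspect_null = summarize(null_reps)
# Rejection rate on data simulated with positive selection estimates power.
power, suspect_selected = summarize(selected_reps)

print("false-positive rate:", false_positive_rate, "suspect:", suspect_null)
print("power:", power, "suspect:", suspect_selected)
```

In this sketch, suspect replicates would be candidates for rerunning the optimization from different starting values before the rates are interpreted.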

Similar articles

Accuracy and power of statistical methods for detecting adaptive evolution in protein coding sequences and for identifying positively selected sites.

Genetics. 2004 Oct;168(2):1041-51. doi: 10.1534/genetics.104.031153.

A method for detecting positive selection at single amino acid sites.

Mol Biol Evol. 1999 Oct;16(10):1315-28. doi: 10.1093/oxfordjournals.molbev.a026042.

New methods for detecting positive selection at single amino acid sites.

J Mol Evol. 2004 Jul;59(1):11-9. doi: 10.1007/s00239-004-2599-6.

Statistical properties of the branch-site test of positive selection.

Mol Biol Evol. 2011 Mar;28(3):1217-28. doi: 10.1093/molbev/msq303. Epub 2010 Nov 18.

Inferring natural selection operating on conservative and radical substitution at single amino acid sites.

Genes Genet Syst. 2007 Aug;82(4):341-60. doi: 10.1266/ggs.82.341.

Detecting amino acid sites under positive selection and purifying selection.

Genetics. 2005 Mar;169(3):1753-62. doi: 10.1534/genetics.104.032144. Epub 2005 Jan 16.

The branch-site test of positive selection is surprisingly robust but lacks power under synonymous substitution saturation and variation in GC.

Mol Biol Evol. 2013 Jul;30(7):1675-86. doi: 10.1093/molbev/mst062. Epub 2013 Apr 4.

Reliabilities of parsimony-based and likelihood-based methods for detecting positive selection at single amino acid sites.

Mol Biol Evol. 2001 Dec;18(12):2179-85. doi: 10.1093/oxfordjournals.molbev.a003764.

Not so different after all: a comparison of methods for detecting amino acid sites under selection.

Mol Biol Evol. 2005 May;22(5):1208-22. doi: 10.1093/molbev/msi105. Epub 2005 Feb 9.

Detecting individual sites subject to episodic diversifying selection.

PLoS Genet. 2012;8(7):e1002764. doi: 10.1371/journal.pgen.1002764. Epub 2012 Jul 12.

Cited by

Detecting Interspecific Positive Selection Using Convolutional Neural Networks.

Mol Biol Evol. 2025 Jul 1;42(7). doi: 10.1093/molbev/msaf154.

Myxozoan parasite genomes assembled from contaminated host data reveal extensive gene order conservation and rapid sequence evolution.

G3 (Bethesda). 2025 Jul 9;15(7). doi: 10.1093/g3journal/jkaf061.

Comparative Chloroplast Genomics Reveals Intrageneric Divergence in .

Int J Mol Sci. 2025 Mar 3;26(5):2248. doi: 10.3390/ijms26052248.

Phylogenetic analysis and detection of positive selection in the SIRT gene family across vertebrates.

Sci Rep. 2025 Jan 5;15(1):848. doi: 10.1038/s41598-025-85344-0.

Molecular evolution and genetic diversity of defective chorion 1 in Anastrepha fraterculus and Anastrepha obliqua (Diptera, Tephritidae).

Dev Genes Evol. 2024 Dec;234(2):153-171. doi: 10.1007/s00427-024-00723-3. Epub 2024 Nov 7.

Bank vole genomics links determinate and indeterminate growth of teeth.

BMC Genomics. 2024 Oct 30;25(1):1000. doi: 10.1186/s12864-024-10901-2.

Relaxed selection in evolution of genes regulating limb development gives clue to variation in forelimb morphology of cetaceans and other mammals.

Proc Biol Sci. 2024 Oct;291(2032):20241106. doi: 10.1098/rspb.2024.1106. Epub 2024 Oct 9.

Evolutionary analysis of ZAP and its cofactors identifies intrinsically disordered regions as central elements in host-pathogen interactions.

Comput Struct Biotechnol J. 2024 Aug 2;23:3143-3154. doi: 10.1016/j.csbj.2024.07.022. eCollection 2024 Dec.

Evolution of Virus-like Features and Intrinsically Disordered Regions in Retrotransposon-derived Mammalian Genes.

Mol Biol Evol. 2024 Aug 2;41(8). doi: 10.1093/molbev/msae154.

CAM evolution is associated with gene family expansion in an explosive bromeliad radiation.

Plant Cell. 2024 Oct 3;36(10):4109-4131. doi: 10.1093/plcell/koae130.

References

False-positive selection identified by ML-based methods: examples from the Sig1 gene of the diatom Thalassiosira weissflogii and the tax gene of a human T-cell lymphotropic virus.

Mol Biol Evol. 2004 May;21(5):914-21. doi: 10.1093/molbev/msh098. Epub 2004 Mar 10.

Comparative study of adaptive molecular evolution in different human immunodeficiency virus groups and subtypes.

J Virol. 2004 Feb;78(4):1962-70. doi: 10.1128/jvi.78.4.1962-1970.2004.

MHC studies in nonmodel vertebrates: what have we learned about natural selection in 15 years?

J Evol Biol. 2003 May;16(3):363-77. doi: 10.1046/j.1420-9101.2003.00531.x.

Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites.

Genetics. 2003 Jul;164(3):1229-36. doi: 10.1093/genetics/164.3.1229.

The effect of positive selection on a sexual reproduction gene in Thalassiosira weissflogii (Bacillariophyta): results obtained from maximum-likelihood and parsimony-based methods.

Mol Biol Evol. 2003 Aug;20(8):1326-8. doi: 10.1093/molbev/msg145. Epub 2003 May 30.

Pervasive adaptive evolution in mammalian fertilization proteins.

Mol Biol Evol. 2003 Jan;20(1):18-20. doi: 10.1093/oxfordjournals.molbev.a004233.

Simulation study of the reliability and robustness of the statistical methods for detecting positive selection at single amino acid sites.

Mol Biol Evol. 2002 Nov;19(11):1865-9. doi: 10.1093/oxfordjournals.molbev.a004010.

Tracking adaptive evolutionary events in genomic sequences.

Genome Biol. 2002;3(6):REVIEWS1018. doi: 10.1186/gb-2002-3-6-reviews1018. Epub 2002 May 29.

Accuracy and power of Bayes prediction of amino acid sites under positive selection.

Mol Biol Evol. 2002 Jun;19(6):950-8. doi: 10.1093/oxfordjournals.molbev.a004152.

The rapid evolution of reproductive proteins.

Nat Rev Genet. 2002 Feb;3(2):137-44. doi: 10.1038/nrg733.
