Detecting clusters of mutations.

Authors

Zhou Tong, Enyeart Peter J, Wilke Claus O

Affiliation

Center for Computational Biology and Bioinformatics, Section of Integrative Biology, University of Texas at Austin, Austin, Texas, United States of America.

Publication

PLoS One. 2008;3(11):e3765. doi: 10.1371/journal.pone.0003765. Epub 2008 Nov 19.

DOI: 10.1371/journal.pone.0003765
PMID: 19018282
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2582452/
Abstract

Positive selection for protein function can lead to multiple mutations within a small stretch of DNA, i.e., to a cluster of mutations. Recently, Wagner proposed a method to detect such mutation clusters. His method, however, did not take into account that residues with high solvent accessibility are inherently more variable than residues with low solvent accessibility. Here, we propose a new algorithm to detect clustered evolution. Our algorithm controls for different substitution probabilities at buried and exposed sites in the tertiary protein structure, and uses random permutations to calculate accurate P values for inferred clusters. We apply the algorithm to genomes of bacteria, fly, and mammals, and find several clusters of mutations in functionally important regions of proteins. Surprisingly, clustered evolution is a relatively rare phenomenon. Only between 2% and 10% of the genes we analyze contain a statistically significant mutation cluster. We also find that not controlling for solvent accessibility leads to an excess of clusters in terminal and solvent-exposed regions of proteins. Our algorithm provides a novel method to identify functionally relevant divergence between groups of species. Moreover, it could also be useful to detect artifacts in automatically assembled genomes.
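
The abstract describes two ingredients: separate substitution probabilities for buried and exposed sites, and random permutations to obtain P values. The sketch below illustrates that general idea only; it is not the authors' algorithm, whose details are not given here. It assumes a hypothetical setup in which each residue is labelled as substituted or not (`subs`) and as exposed or buried (`exposed`), takes the largest substitution count in any sliding window of `width` residues as the clustering statistic, and builds the null distribution by shuffling substitution labels within each accessibility class. The window width, the statistic, and all function names are assumptions chosen for illustration.

```python
import random


def max_window_count(subs, width):
    """Largest number of substituted sites in any window of `width` residues."""
    current = sum(subs[:width])
    best = current
    for i in range(width, len(subs)):
        # Slide the window one residue to the right.
        current += subs[i] - subs[i - width]
        best = max(best, current)
    return best


def stratified_shuffle(subs, exposed, rng):
    """Shuffle substitution labels separately among exposed and buried sites.

    The number of substitutions in each accessibility class is preserved,
    so the null model keeps exposed sites as variable as they were observed.
    """
    shuffled = list(subs)
    for cls in (True, False):
        idx = [i for i, e in enumerate(exposed) if e == cls]
        labels = [subs[i] for i in idx]
        rng.shuffle(labels)
        for i, lab in zip(idx, labels):
            shuffled[i] = lab
    return shuffled


def cluster_p_value(subs, exposed, width=10, n_perm=2000, seed=0):
    """Return (observed statistic, permutation P value) for one protein."""
    rng = random.Random(seed)
    observed = max_window_count(subs, width)
    hits = sum(
        max_window_count(stratified_shuffle(subs, exposed, rng), width) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the estimated P value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Under these assumptions, a run of substitutions that coincides exactly with the only exposed stretch of a protein yields P = 1, because the stratified shuffle cannot move those substitutions anywhere else. This mirrors the abstract's point that ignoring solvent accessibility inflates the number of clusters in exposed regions.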

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6cb/2582452/80ff45b346a5/pone.0003765.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6cb/2582452/094f1f87280e/pone.0003765.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6cb/2582452/0bcbeb8fdec4/pone.0003765.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6cb/2582452/736f5e93d4d1/pone.0003765.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6cb/2582452/5e7a242d6649/pone.0003765.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6cb/2582452/65c72f1ea6ec/pone.0003765.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6cb/2582452/031b9f4545cd/pone.0003765.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6cb/2582452/88c0237df60c/pone.0003765.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6cb/2582452/7bfbc79fd213/pone.0003765.g009.jpg

Cited by

1
Adaptive genetics reveals constraints on protein structure/function by evolving E. coli under constant nutrient limitation.
BMC Biol. 2025 Aug 20;23(1):261. doi: 10.1186/s12915-025-02331-7.
2
Proteome-wide assessment of differential missense variant clustering in neurodevelopmental disorders and cancer.
Cell Genom. 2025 Apr 9;5(4):100807. doi: 10.1016/j.xgen.2025.100807. Epub 2025 Mar 11.
3
Proteome-Wide Assessment of Clustering of Missense Variants in Neurodevelopmental Disorders Versus Cancer.
medRxiv. 2024 Feb 4:2024.02.02.24302238. doi: 10.1101/2024.02.02.24302238.
4
The roles of antimicrobial resistance, phage diversity, isolation source and selection in shaping the genomic architecture of .
Microb Genom. 2021 Aug;7(8). doi: 10.1099/mgen.0.000616.
5
Evolutionary selection of biofilm-mediated extended phenotypes in Yersinia pestis in response to a fluctuating environment.
Nat Commun. 2020 Jan 15;11(1):281. doi: 10.1038/s41467-019-14099-w.
6
Leveraging protein quaternary structure to identify oncogenic driver mutations.
BMC Bioinformatics. 2016 Mar 22;17:137. doi: 10.1186/s12859-016-0963-3.
7
mutation3D: Cancer Gene Prediction Through Atomic Clustering of Coding Variants in the Structural Proteome.
Hum Mutat. 2016 May;37(5):447-56. doi: 10.1002/humu.22963. Epub 2016 Feb 18.
8
A spatial simulation approach to account for protein structure when identifying non-random somatic mutations.
BMC Bioinformatics. 2014 Jul 3;15:231. doi: 10.1186/1471-2105-15-231.
9
Detecting patches of protein sites of influenza A viruses under positive selection.
Mol Biol Evol. 2012 Aug;29(8):2063-71. doi: 10.1093/molbev/mss095. Epub 2012 Mar 16.
10
The non-random clustering of non-synonymous substitutions and its relationship to evolutionary rate.
BMC Genomics. 2011 Aug 16;12:415. doi: 10.1186/1471-2164-12-415.

References

1
Elucidation of phenotypic adaptations: Molecular analyses of dim-light vision proteins in vertebrates.
Proc Natl Acad Sci U S A. 2008 Sep 9;105(36):13480-5. doi: 10.1073/pnas.0802426105. Epub 2008 Sep 3.
2
The origin of adaptive phenotypes.
Proc Natl Acad Sci U S A. 2008 Sep 9;105(36):13193-4. doi: 10.1073/pnas.0807440105. Epub 2008 Sep 3.
3
Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution.
Cell. 2008 Jul 25;134(2):341-52. doi: 10.1016/j.cell.2008.05.042.
4
Codon-based tests of positive selection, branch lengths, and the evolution of mammalian immune system genes.
Immunogenetics. 2008 Sep;60(9):495-506. doi: 10.1007/s00251-008-0304-4. Epub 2008 Jun 26.
5
Structural mapping of protein interactions reveals differences in evolutionary pressures correlated to mRNA level and protein abundance.
Structure. 2007 Nov;15(11):1442-51. doi: 10.1016/j.str.2007.09.010.
6
Human PAML browser: a database of positive selection on human genes using phylogenetic methods.
Nucleic Acids Res. 2008 Jan;36(Database issue):D800-8. doi: 10.1093/nar/gkm764. Epub 2007 Oct 25.
7
Looking for Darwin in all the wrong places: the misguided quest for positive selection at the nucleotide sequence level.
Heredity (Edinb). 2007 Oct;99(4):364-73. doi: 10.1038/sj.hdy.6801031. Epub 2007 Jul 11.
8
Rapid detection of positive selection in genes and genomes through variation clusters.
Genetics. 2007 Aug;176(4):2451-63. doi: 10.1534/genetics.107.074732. Epub 2007 Jul 1.
9
Quantifying the impact of protein tertiary structure on molecular evolution.
Mol Biol Evol. 2007 Aug;24(8):1769-82. doi: 10.1093/molbev/msm097. Epub 2007 May 23.
10
A tale of two tails: why are terminal residues of proteins exposed?
Bioinformatics. 2007 Jan 15;23(2):e225-30. doi: 10.1093/bioinformatics/btl318.