Beyond mutations: Accounting for quantitative changes in the analysis of protein evolution.

Authors

Wu Xiaoyong, Rai Shesh N, Weber Georg F

Affiliations

Biostatistics and Informatics Shared Resources, University of Cincinnati Cancer Center, College of Medicine, Cincinnati, OH, USA.

Cancer Data Science Center, University of Cincinnati College of Medicine Department of Biostatistics, Health Informatics and Data Sciences, Cincinnati, OH, USA.

Publication information

Comput Struct Biotechnol J. 2024 Jun 21;23:2637-2647. doi: 10.1016/j.csbj.2024.06.017. eCollection 2024 Dec.

DOI:10.1016/j.csbj.2024.06.017

PMID:39021584

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11253266/

Abstract

Molecular phylogenetic research has relied on the analysis of the coding sequences of genes or of the amino acid sequences of the encoded proteins. Enumerating the numbers of mismatches, being indicators of mutation, has been central to pertinent algorithms. Specific amino acids possess quantifiable characteristics that enable the conversion from "words" (strings of letters denoting amino acids or bases) to "waves" (strings of quantitative values representing the physico-chemical properties) or to matrices (coordinates representing the positions in a comprehensive property space). The application of such numerical representations to evolutionary analysis takes into account not only the occurrence of mutations but also their properties as influences that drive speciation, because selective pressures favor certain mutations over others, and this predilection is represented in the characteristics of the incorporated amino acids (it is not borne out solely by the mismatches). Besides being more discriminating sources for tree-generating algorithms than match/mismatch, the number strings can be examined for overall similarity with average mutual information, autocorrelation, and fractal dimension. Bivariate wavelet analysis aids in distinguishing hypermutable versus conserved domains of the protein. The matrix depiction is readily subjected to comparisons of distances, and it allows the generation of heat maps or graphs. This analysis preserves the accepted taxa order where tree construction with standard approaches yields conflicting results (for the protein S100A6). It also aids hypothesis generation about the origin of mitochondrial proteins. These analytical algorithms have been automated in R and are applicable to various processes that are describable in matrix format.
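The "words" to "waves" conversion described in the abstract can be illustrated with a minimal sketch. It is not the authors' R implementation. It assumes a single property scale (the Kyte-Doolittle hydropathy index, chosen here only as a familiar example), pre-aligned sequences of equal length, and hypothetical helper names (`to_wave`, `mismatches`, `wave_distance`, `autocorrelation`). The toy peptides are invented for illustration. The point is that two variants with the same mismatch count can lie at very different distances once the property values are considered.

```python
import math

# Kyte-Doolittle hydropathy index. Using this particular scale is an
# assumption for illustration; the abstract does not name the scales used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def to_wave(seq):
    """Convert a 'word' (amino acid letters) into a 'wave' (numeric values)."""
    return [KD[aa] for aa in seq.upper()]

def mismatches(a, b):
    """Classical mismatch count between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def wave_distance(a, b):
    """Euclidean distance between the property waves of two aligned sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(to_wave(a), to_wave(b))))

def autocorrelation(wave, lag):
    """Sample autocorrelation of a wave at the given lag (0.0 if undefined)."""
    n = len(wave)
    mean = sum(wave) / n
    var = sum((x - mean) ** 2 for x in wave)
    if var == 0 or lag >= n:
        return 0.0
    cov = sum((wave[i] - mean) * (wave[i + lag] - mean) for i in range(n - lag))
    return cov / var

# Hypothetical toy peptides: one conservative and one radical substitution.
reference = "MKVLA"
conservative = "MKILA"  # V -> I, both hydrophobic
radical = "MKDLA"       # V -> D, hydrophobic -> charged

for variant in (conservative, radical):
    print(variant, mismatches(reference, variant),
          round(wave_distance(reference, variant), 2))
```

Both variants differ from the reference at exactly one position, yet the wave distance separates the conservative substitution from the radical one. A fuller treatment would stack several property scales into a matrix per sequence, which is the "matrix" representation the abstract refers to.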

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/721f/11253266/ecfa2cc68269/gr1.jpg
