Structural Phylogenetics with Confidence.

Affiliations

Centre for Theoretical Chemistry and Physics, School of Natural and Computational Sciences, Massey University Auckland, Auckland, New Zealand.

Bioinformatics Institute, Agency for Science, Technology and Research, Singapore.

Publication information

Mol Biol Evol. 2020 Sep 1;37(9):2711-2726. doi: 10.1093/molbev/msaa100.

DOI:10.1093/molbev/msaa100

PMID:32302382

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7475046/

Abstract

For evaluating the deepest evolutionary relationships among proteins, sequence similarity is too low for application of sequence-based homology search or phylogenetic methods. In such cases, comparison of protein structures, which are often better conserved than sequences, may provide an alternative means of uncovering deep evolutionary signal. Although major protein structure databases such as SCOP and CATH hierarchically group protein structures, they do not describe the specific evolutionary relationships within a hierarchical level. Structural phylogenies have the potential to fill this gap. However, it is difficult to assess evolutionary relationships derived from structural phylogenies without some means of assessing confidence in such trees. We therefore address two shortcomings in the application of structural data to deep phylogeny. First, we examine whether phylogenies derived from pairwise structural comparisons are sensitive to differences in protein length and shape. We find that structural phylogenetics is best employed where structures have very similar lengths, and that shape fluctuations generated during molecular dynamics simulations impact pairwise comparisons, but not so drastically as to eliminate evolutionary signal. Second, we address the absence of statistical support for structural phylogeny. We present a method for assessing confidence in a structural phylogeny using shape fluctuations generated via molecular dynamics or Monte Carlo simulations of proteins. Our approach will aid the evolutionary reconstruction of relationships across structurally defined protein superfamilies. With the Protein Data Bank now containing in excess of 158,000 entries (December 2019), we predict that structural phylogenetics will become a useful tool for ordering the protein universe.
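
The confidence idea in the abstract can be illustrated with a small resampling sketch. This is not the authors' implementation; it only shows the general shape of the approach under several assumptions. Each protein is represented by an ensemble of conformers (here synthetic toy coordinates standing in for molecular dynamics or Monte Carlo snapshots), one conformer per protein is drawn at random in each replicate, a tree is built from pairwise structural distances, and the support for a clade is the fraction of replicate trees that contain it. The distance measure (distance-matrix RMSD, which needs equal-length structures), the tree method (UPGMA), and all function names are illustrative choices, not taken from the paper.

```python
import itertools
import math
import random

def drmsd(a, b):
    """Distance-matrix RMSD between two equal-length coordinate lists (stand-in metric)."""
    pairs = list(itertools.combinations(range(len(a)), 2))
    sq = sum((math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2 for i, j in pairs)
    return math.sqrt(sq / len(pairs))

def pairwise(structs):
    """Map frozenset({x, y}) -> structural distance for every pair of named structures."""
    return {frozenset((x, y)): drmsd(structs[x], structs[y])
            for x, y in itertools.combinations(structs, 2)}

def upgma_clades(names, dist):
    """Return the non-trivial clades (frozensets of leaf names) of a UPGMA tree."""
    clusters = [frozenset([n]) for n in names]
    clades = set()

    def avg(c1, c2):
        # Average linkage: mean leaf-to-leaf distance between two clusters.
        return sum(dist[frozenset((x, y))] for x in c1 for y in c2) / (len(c1) * len(c2))

    # Stop at two clusters: the final merge is the root, which is always present.
    while len(clusters) > 2:
        c1, c2 = min(itertools.combinations(clusters, 2), key=lambda p: avg(*p))
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        clades.add(merged)
    return clades

def clade_support(ensembles, n_reps=200, seed=0):
    """Fraction of resampled trees that recover each clade of the reference tree.

    The reference tree uses the first conformer of each ensemble (an assumption
    made for this sketch, e.g. the starting structure of a simulation).
    """
    rng = random.Random(seed)
    names = sorted(ensembles)
    reference = upgma_clades(names, pairwise({n: ensembles[n][0] for n in names}))
    counts = dict.fromkeys(reference, 0)
    for _ in range(n_reps):
        sample = {n: rng.choice(ensembles[n]) for n in names}
        for clade in upgma_clades(names, pairwise(sample)):
            if clade in counts:
                counts[clade] += 1
    return {clade: c / n_reps for clade, c in counts.items()}

def toy_ensemble(spacing, rng, n_atoms=6, n_conf=20, sigma=0.02):
    """Synthetic conformers: points on a line with small Gaussian shape fluctuations."""
    return [[(i * spacing + rng.gauss(0, sigma), rng.gauss(0, sigma), rng.gauss(0, sigma))
             for i in range(n_atoms)] for _ in range(n_conf)]

# Hypothetical toy data: A and B share one shape, C and D share another.
rng = random.Random(1)
ensembles = {name: toy_ensemble(s, rng)
             for name, s in [("A", 1.0), ("B", 1.1), ("C", 2.0), ("D", 2.1)]}
support = clade_support(ensembles)
for clade, frac in sorted(support.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(clade), frac)
```

In this toy setting the fluctuations are small relative to the differences between the two shape groups, so both reference clades should be recovered in nearly every replicate. With real simulation snapshots, clades whose support drops well below 1 would be the ones to treat with caution.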
