
PC_ali: a tool for improved multiple alignments and evolutionary inference based on a hybrid protein sequence and structure similarity score.

Affiliations

Centro de Biologia Molecular "Severo Ochoa" (CBMSO), CSIC-UAM Cantoblanco, 28049 Madrid, Spain.

Bioinformatics Facility CBMSO, CSIC-UAM Cantoblanco, 28049 Madrid, Spain.

Publication information

Bioinformatics. 2023 Nov 1;39(11). doi: 10.1093/bioinformatics/btad630.

DOI:10.1093/bioinformatics/btad630
PMID:37847775
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10628387/
Abstract

MOTIVATION

Evolutionary inference depends crucially on the quality of multiple sequence alignments (MSA), which is problematic for distantly related proteins. Since protein structure is more conserved than sequence, it seems natural to use structure alignments for distant homologs. However, structure alignments may not be suitable for inferring evolutionary relationships.

RESULTS

Here we examined four protein similarity measures that depend on sequence and structure (fraction of aligned residues, sequence identity, fraction of superimposed residues, and contact overlap), finding that they are closely correlated but that none of them provides a complete and unbiased picture of conservation in proteins. Therefore, we propose a new hybrid protein sequence and structure similarity score, PC_sim, based on their main principal component. The corresponding divergence measure, PC_div, shows the strongest correlation with divergences obtained from individual similarities, suggesting that it infers accurate evolutionary divergences. We developed the program PC_ali, which constructs protein MSAs either de novo or by modifying an input MSA, using a similarity matrix based on PC_sim. The program constructs a starting MSA based on the maximal cliques of the graph of pairwise alignments (PAs) and refines it through progressive alignments along the tree reconstructed with PC_div. Compared with eight state-of-the-art multiple structure or sequence alignment tools, PC_ali achieves higher or equal aligned fraction and structural scores, sequence identity higher than structure aligners although lower than sequence aligners, the highest PC_sim score, and the highest similarity with the MSAs produced by other tools and with the Balibase reference MSA.
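
The idea of taking the main principal component of four correlated similarity measures can be illustrated with a small sketch. This is not the PC_ali implementation: the abstract does not give the normalization or weights used for PC_sim, so the code below only shows generic principal component analysis on standardized measures, computed by power iteration. The toy values and all function names are hypothetical.

```python
import math

def standardize(columns):
    """Center each measure and scale it to unit variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        out.append([(x - mean) / sd for x in col])
    return out

def correlation_matrix(zcols):
    """Correlation matrix of already standardized measures."""
    n = len(zcols[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in zcols]
            for ci in zcols]

def main_component(matrix, iterations=500):
    """Dominant eigenvector of a symmetric matrix via power iteration."""
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(k)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical toy data: one value per protein pair, ordered from
# closely related pairs to distantly related pairs.
aligned_fraction      = [0.95, 0.90, 0.80, 0.70, 0.60]
sequence_identity     = [0.80, 0.55, 0.35, 0.20, 0.12]
superimposed_fraction = [0.92, 0.85, 0.70, 0.55, 0.45]
contact_overlap       = [0.90, 0.80, 0.65, 0.50, 0.40]

measures = [aligned_fraction, sequence_identity,
            superimposed_fraction, contact_overlap]
z = standardize(measures)
weights = main_component(correlation_matrix(z))

# Hybrid score per pair: projection of the standardized measures
# onto the main principal component.
hybrid_score = [sum(w * col[i] for w, col in zip(weights, z))
                for i in range(len(aligned_fraction))]

print("weights:", [round(w, 3) for w in weights])
print("hybrid score per pair:", [round(s, 3) for s in hybrid_score])
```

Because the four toy measures are positively correlated, every weight comes out positive and the hybrid score decreases from the closest pair to the most distant one. How PC_sim is rescaled and how PC_div is derived from it are described in the paper, not here.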

AVAILABILITY AND IMPLEMENTATION

https://github.com/ugobas/PC_ali.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f10/10628387/45b6cf336d8c/btad630f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f10/10628387/d4e012bf1313/btad630f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f10/10628387/16744143712f/btad630f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f10/10628387/6156d69a7ba4/btad630f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f10/10628387/ba29e741d006/btad630f5.jpg

Similar articles

1
PC_ali: a tool for improved multiple alignments and evolutionary inference based on a hybrid protein sequence and structure similarity score.
Bioinformatics. 2023 Nov 1;39(11). doi: 10.1093/bioinformatics/btad630.
2
Improving the alignment quality of consistency based aligners with an evaluation function using synonymous protein words.
PLoS One. 2011;6(12):e27872. doi: 10.1371/journal.pone.0027872. Epub 2011 Dec 2.
3
Characterization of multiple sequence alignment errors using complete-likelihood score and position-shift map.
BMC Bioinformatics. 2016 Mar 18;17:133. doi: 10.1186/s12859-016-0945-5.
4
Calculating and scoring high quality multiple flexible protein structure alignments.
Bioinformatics. 2016 Sep 1;32(17):2650-8. doi: 10.1093/bioinformatics/btw300. Epub 2016 May 13.
5
Accuracy of structure-based sequence alignment of automatic methods.
BMC Bioinformatics. 2007 Sep 20;8:355. doi: 10.1186/1471-2105-8-355.
6
Iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases.
Bioinformatics. 2000 Nov;16(11):988-1002. doi: 10.1093/bioinformatics/16.11.988.
7
AL2CO: calculation of positional conservation in a protein sequence alignment.
Bioinformatics. 2001 Aug;17(8):700-12. doi: 10.1093/bioinformatics/17.8.700.
8
RPfam: A refiner towards curated-like multiple sequence alignments of the Pfam protein families.
J Bioinform Comput Biol. 2022 Aug;20(4):2240002. doi: 10.1142/S0219720022400029. Epub 2022 Apr 14.
9
Robust sequence alignment using evolutionary rates coupled with an amino acid substitution matrix.
BMC Bioinformatics. 2015 Aug 14;16:255. doi: 10.1186/s12859-015-0688-8.
10
Highly significant improvement of protein sequence alignments with AlphaFold2.
Bioinformatics. 2022 Nov 15;38(22):5007-5011. doi: 10.1093/bioinformatics/btac625.

Cited by

1
Protein Structural Phylogenetics.
Genome Biol Evol. 2025 Jul 30;17(8). doi: 10.1093/gbe/evaf139.
2
Faithful Interpretation of Protein Structures through Weighted Persistent Homology Improves Evolutionary Distance Estimation.
Mol Biol Evol. 2025 Feb 3;42(2). doi: 10.1093/molbev/msae271.
3
ProteinReDiff: Complex-based ligand-binding proteins redesign by equivariant diffusion-based generative models.
