
Optimal data collection for correlated mutation analysis.

Authors

Ashkenazy Haim, Unger Ron, Kliger Yossef

Affiliation

Compugen LTD, Tel Aviv 69512, Israel.

Publication

Proteins. 2009 Feb 15;74(3):545-55. doi: 10.1002/prot.22168.

DOI:10.1002/prot.22168
PMID:18655065
Abstract

The main objective of correlated mutation analysis (CMA) is to predict intraprotein residue-residue interactions from sequence alone. Despite considerable progress in algorithms and computer capabilities, the performance of CMA methods remains quite low. Here we examine whether, and to what extent, the quality of CMA methods depends on the sequences that are included in the multiple sequence alignment (MSA). The results revealed a strong correlation between the number of homologs in an MSA and CMA prediction strength. Furthermore, whereas many of the current methods include only orthologs in the MSA, we found that it is beneficial to include both orthologs and paralogs. Remarkably, even remote homologs contribute to the improved accuracy. Based on our findings we put forward an automated data collection procedure, with a minimal coverage of 50% between the query protein and its orthologs and paralogs. This procedure improves accuracy even in the absence of manual curation. In this era of massive sequencing and exploding sequence data, our results suggest that correlated mutation-based methods have not reached their inherent performance limitations and that the role of CMA in structural biology is far from being fulfilled.
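
The coverage-based collection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not define how coverage is computed or which CMA score is used. Here coverage is assumed to be the fraction of non-gap query positions that align to a residue in the homolog, and mutual information between alignment columns stands in for a generic correlated-mutation score. The function names `coverage`, `filter_by_coverage`, and `mutual_information` are hypothetical.

```python
import math
from collections import Counter

GAP = "-"


def coverage(query_row, homolog_row):
    """Fraction of non-gap query positions aligned to a residue in the homolog.

    Assumption: this is only one plausible definition of "coverage".
    """
    query_positions = [i for i, c in enumerate(query_row) if c != GAP]
    if not query_positions:
        return 0.0
    covered = sum(1 for i in query_positions if homolog_row[i] != GAP)
    return covered / len(query_positions)


def filter_by_coverage(msa, min_coverage=0.5):
    """Keep the query (first row) plus every homolog meeting the threshold."""
    query = msa[0]
    kept = [row for row in msa[1:] if coverage(query, row) >= min_coverage]
    return [query] + kept


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Rows with a gap in either column are skipped. Used here as a stand-in
    correlated-mutation score, not the method evaluated in the paper.
    """
    pairs = [(row[i], row[j]) for row in msa if row[i] != GAP and row[j] != GAP]
    n = len(pairs)
    if n == 0:
        return 0.0
    count_i = Counter(a for a, _ in pairs)
    count_j = Counter(b for _, b in pairs)
    count_ij = Counter(pairs)
    return sum(
        (c / n) * math.log2(c * n / (count_i[a] * count_j[b]))
        for (a, b), c in count_ij.items()
    )
```

With a toy alignment such as `["ACDE", "AC--", "GCKE", "A---", "GCKD"]`, `filter_by_coverage` drops only the row that covers a quarter of the query, and column pairs can then be scored on the remaining rows. A fully conserved column carries no co-variation signal, so its score against any other column is zero.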


Similar articles

1
Optimal data collection for correlated mutation analysis.
Proteins. 2009 Feb 15;74(3):545-55. doi: 10.1002/prot.22168.
2
Reducing phylogenetic bias in correlated mutation analysis.
Protein Eng Des Sel. 2010 May;23(5):321-6. doi: 10.1093/protein/gzp078. Epub 2010 Jan 12.
3
Direct mapping and alignment of protein sequences onto genomic sequence.
Bioinformatics. 2008 Nov 1;24(21):2438-44. doi: 10.1093/bioinformatics/btn460. Epub 2008 Aug 26.
4
Protein structure prediction based on sequence similarity.
Methods Mol Biol. 2009;569:129-56. doi: 10.1007/978-1-59745-524-4_7.
5
Beyond the Twilight Zone: automated prediction of structural properties of proteins by recursive neural networks and remote homology information.
Proteins. 2009 Oct;77(1):181-90. doi: 10.1002/prot.22429.
6
MeDor: a metaserver for predicting protein disorder.
BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S25. doi: 10.1186/1471-2164-9-S2-S25.
7
SSMap: a new UniProt-PDB mapping resource for the curation of structural-related information in the UniProt/Swiss-Prot Knowledgebase.
BMC Bioinformatics. 2008 Sep 23;9:391. doi: 10.1186/1471-2105-9-391.
8
Accurate prediction for atomic-level protein design and its application in diversifying the near-optimal sequence space.
Proteins. 2009 May 15;75(3):682-705. doi: 10.1002/prot.22280.
9
COPid: composition based protein identification.
In Silico Biol. 2008;8(2):121-8.
10
Model-based prediction of sequence alignment quality.
Bioinformatics. 2008 Oct 1;24(19):2165-71. doi: 10.1093/bioinformatics/btn414. Epub 2008 Aug 4.

Cited by

1
Petascale Homology Search for Structure Prediction.
bioRxiv. 2023 Jul 11:2023.07.10.548308. doi: 10.1101/2023.07.10.548308.
2
Evolutionary coupling analysis identifies the impact of disease-associated variants at less-conserved sites.
Nucleic Acids Res. 2019 Sep 19;47(16):e94. doi: 10.1093/nar/gkz536.
3
Improving protein-protein interaction prediction using evolutionary information from low-quality MSAs.
PLoS One. 2017 Feb 6;12(2):e0169356. doi: 10.1371/journal.pone.0169356. eCollection 2017.
4
H2rs: deducing evolutionary and functionally important residue positions by means of an entropy and similarity based analysis of multiple sequence alignments.
BMC Bioinformatics. 2014 Apr 27;15:118. doi: 10.1186/1471-2105-15-118.
5
Conserved and variable correlated mutations in the plant MADS protein network.
BMC Genomics. 2010 Oct 28;11:607. doi: 10.1186/1471-2164-11-607.
6
Validation of coevolving residue algorithms via pipeline sensitivity analysis: ELSC and OMES and ZNMI, oh my!
PLoS One. 2010 Jun 1;5(6):e10779. doi: 10.1371/journal.pone.0010779.
7
Peptides modulating conformational changes in secreted chaperones: from in silico design to preclinical proof of concept.
Proc Natl Acad Sci U S A. 2009 Aug 18;106(33):13797-801. doi: 10.1073/pnas.0906514106. Epub 2009 Aug 5.