

Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era.

Author information

Howard Hughes Medical Institute, Department of Biochemistry, and Molecular and Cellular Biology Program, University of Washington, Seattle, WA 98195.

Publication information

Proc Natl Acad Sci U S A. 2013 Sep 24;110(39):15674-9. doi: 10.1073/pnas.1314045110. Epub 2013 Sep 5.

Abstract

Recently developed methods have shown considerable promise in predicting residue-residue contacts in protein 3D structures using evolutionary covariance information. However, these methods require large numbers of evolutionarily related sequences to robustly assess the extent of residue covariation, and the larger the protein family, the more likely that contact information is unnecessary because a reasonable model can be built based on the structure of a homolog. Here we describe a method that integrates sequence coevolution and structural context information using a pseudolikelihood approach, allowing more accurate contact predictions from fewer homologous sequences. We rigorously assess the utility of predicted contacts for protein structure prediction using large and representative sequence and structure databases from recent structure prediction experiments. We find that contact predictions are likely to be accurate when the number of aligned sequences (with sequence redundancy reduced to 90%) is greater than five times the length of the protein, and that accurate predictions are likely to be useful for structure modeling if the aligned sequences are more similar to the protein of interest than to the closest homolog of known structure. These conditions are currently met by 422 of the protein families collected in the Pfam database.
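The first criterion in the abstract is a simple arithmetic threshold, and a minimal sketch may make it concrete. The function name, parameter names, and the default `factor` argument below are illustrative assumptions, not taken from the paper's software. The sequence count is assumed to have already been filtered to 90% redundancy by whatever alignment pipeline is in use. The second criterion (similarity of the aligned sequences relative to the closest homolog of known structure) is not encoded here, because the abstract does not specify how that similarity is measured.

```python
def contacts_likely_accurate(n_nonredundant_seqs: int,
                             protein_length: int,
                             factor: float = 5.0) -> bool:
    """Rule of thumb stated in the abstract (hypothetical helper).

    Returns True when the number of aligned sequences, counted after
    redundancy reduction at 90%, is greater than `factor` times the
    protein length. The default factor of 5 comes from the abstract;
    everything else about this function is an illustrative assumption.
    """
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")
    if n_nonredundant_seqs < 0:
        raise ValueError("n_nonredundant_seqs must be non-negative")
    # Strict inequality: the abstract says "greater than five times".
    return n_nonredundant_seqs > factor * protein_length


if __name__ == "__main__":
    # A 150-residue protein needs more than 750 non-redundant sequences.
    for n_seqs in (500, 750, 1000):
        print(n_seqs, contacts_likely_accurate(n_seqs, 150))
```

This check only says whether predictions are likely to be accurate. Whether they are also useful for modeling depends on the second condition in the abstract, which would need a similarity measure defined outside this sketch.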


Similar articles

9
Predicting residue-residue contacts using random forest models.
Bioinformatics. 2011 Dec 15;27(24):3379-84. doi: 10.1093/bioinformatics/btr579. Epub 2011 Oct 20.

Cited by

7
Hierarchical design of pseudosymmetric protein nanocages.
Nature. 2025 Feb;638(8050):553-561. doi: 10.1038/s41586-024-08360-6. Epub 2024 Dec 18.

References

1
Improved contact prediction in proteins: using pseudolikelihoods to infer Potts models.
Phys Rev E Stat Nonlin Soft Matter Phys. 2013 Jan;87(1):012707. doi: 10.1103/PhysRevE.87.012707. Epub 2013 Jan 11.
2
Reliable and robust detection of coevolving protein residues.
Protein Eng Des Sel. 2012 Nov;25(11):705-13. doi: 10.1093/protein/gzs081. Epub 2012 Oct 16.
3
Genomics-aided structure prediction.
Proc Natl Acad Sci U S A. 2012 Jun 26;109(26):10340-5. doi: 10.1073/pnas.1207864109. Epub 2012 Jun 12.
9
Learning generative models for protein fold families.
Proteins. 2011 Apr;79(4):1061-78. doi: 10.1002/prot.22934. Epub 2011 Jan 25.
