PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments.

Affiliation

Department of Computer Science, Bioinformatics Group, Centre for Computational Statistics and Machine Learning, University College London, Malet Place, London WC1E 6BT, UK.

Publication information

Bioinformatics. 2012 Jan 15;28(2):184-90. doi: 10.1093/bioinformatics/btr638. Epub 2011 Nov 17.

DOI:10.1093/bioinformatics/btr638

PMID:22101153

Abstract

MOTIVATION

The accurate prediction of residue-residue contacts, critical for maintaining the native fold of a protein, remains an open problem in the field of structural bioinformatics. Interest in this long-standing problem has increased recently with algorithmic improvements and the rapid growth in the sizes of sequence families. Progress could have major impacts in both structure and function prediction to name but two benefits. Sequence-based contact predictions are usually made by identifying correlated mutations within multiple sequence alignments (MSAs), most commonly through the information-theoretic approach of calculating mutual information between pairs of sites in proteins. These predictions are often inaccurate because the true covariation signal in the MSA is often masked by biases from many ancillary indirect-coupling or phylogenetic effects. Here we present a novel method, PSICOV, which introduces the use of sparse inverse covariance estimation to the problem of protein contact prediction. Our method builds on work which had previously demonstrated corrections for phylogenetic and entropic correlation noise and allows accurate discrimination of direct from indirectly coupled mutation correlations in the MSA.
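The distinction between direct and indirect coupling can be illustrated with a toy example. The sketch below is not the PSICOV implementation: it assumes three hypothetical continuous variables in a chain (x influences y, y influences z) rather than amino acid columns of an MSA, and it uses plain matrix inversion with no sparsity penalty. It shows only why the inverse covariance (precision) matrix is informative: x and z are clearly correlated in the covariance matrix, yet their entry in the inverse is zero because the coupling runs entirely through y.

```python
# Toy illustration only (hypothetical data, not the PSICOV code).
# Chain model: y = x + noise, z = y + noise, every term with unit variance.
# Resulting covariance matrix for (x, y, z):
covariance = [
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 2.0],
    [1.0, 2.0, 3.0],
]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def correlation(cov, i, j):
    """Ordinary (marginal) correlation between variables i and j."""
    return cov[i][j] / (cov[i][i] * cov[j][j]) ** 0.5


def partial_correlation(prec, i, j):
    """Correlation between i and j after conditioning on all other variables."""
    return -prec[i][j] / (prec[i][i] * prec[j][j]) ** 0.5


precision = invert(covariance)

marginal_xz = correlation(covariance, 0, 2)         # indirect coupling is visible here
partial_xz = partial_correlation(precision, 0, 2)   # and vanishes here
partial_xy = partial_correlation(precision, 0, 1)   # direct coupling survives

print(f"marginal corr(x, z) = {marginal_xz:.3f}")
print(f"partial  corr(x, z) = {partial_xz:.3f}")
print(f"partial  corr(x, y) = {partial_xy:.3f}")
```

With real alignments the covariance matrix is estimated from limited data and is typically not invertible as is, which is where a sparse estimator comes in; that step is outside the scope of this sketch.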

RESULTS

PSICOV displays a mean precision substantially better than the best performing normalized mutual information approach and Bayesian networks. For 118 out of 150 targets, the L/5 (i.e. top-L/5 predictions for a protein of length L) precision for long-range contacts (sequence separation >23) was ≥ 0.5, which represents an improvement sufficient to be of significant benefit in protein structure prediction or model quality assessment.
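The L/5 long-range precision quoted above can be computed as in the following sketch. The function name, the input layout (a dictionary of scored residue pairs and a set of true contacts) and the toy data are assumptions made for illustration; the rounding of L/5 by integer division and the use of the number of selected pairs as the denominator are also assumptions, since conventions vary between benchmarks.

```python
def top_l5_long_range_precision(scores, true_contacts, length, min_separation=24):
    """Precision of the top-L/5 long-range predictions.

    scores:         dict mapping (i, j) residue index pairs (i < j) to a score
    true_contacts:  set of (i, j) pairs (i < j) that are in contact in the structure
    length:         protein length L
    min_separation: smallest allowed j - i; 24 corresponds to "separation > 23"
    """
    # Keep only long-range pairs.
    long_range = [
        (pair, score)
        for pair, score in scores.items()
        if pair[1] - pair[0] >= min_separation
    ]
    # Rank by decreasing score and keep the top L/5 (assumed: integer division).
    long_range.sort(key=lambda item: item[1], reverse=True)
    top = long_range[: max(1, length // 5)]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in true_contacts)
    return hits / len(top)


# Hypothetical toy data for a 30-residue protein (L/5 = 6 predictions).
toy_scores = {
    (5, 10): 0.99,  # short-range, ignored
    (0, 23): 0.95,  # separation 23, not > 23, ignored
    (0, 24): 0.90,
    (1, 26): 0.80,
    (2, 27): 0.70,
    (3, 28): 0.60,
    (4, 29): 0.50,
    (0, 29): 0.40,
    (1, 25): 0.30,  # ranked 7th among long-range pairs, outside the top 6
}
toy_contacts = {(0, 24), (2, 27), (4, 29), (5, 10), (1, 25)}

toy_precision = top_l5_long_range_precision(toy_scores, toy_contacts, length=30)
print(toy_precision)
```

In this toy case three of the six selected pairs are true contacts, so the target would just meet the 0.5 threshold used in the Results.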

AVAILABILITY

The PSICOV source code can be downloaded from http://bioinf.cs.ucl.ac.uk/downloads/PSICOV.

Similar articles

PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments.

Bioinformatics. 2012 Jan 15;28(2):184-90. doi: 10.1093/bioinformatics/btr638. Epub 2011 Nov 17.

MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins.

Bioinformatics. 2015 Apr 1;31(7):999-1006. doi: 10.1093/bioinformatics/btu791. Epub 2014 Nov 26.

COUSCOus: improved protein contact prediction using an empirical Bayes covariance estimator.

BMC Bioinformatics. 2016 Dec 15;17(1):533. doi: 10.1186/s12859-016-1400-3.

PconsC: combination of direct information methods and alignments improves contact prediction.

Bioinformatics. 2013 Jul 15;29(14):1815-6. doi: 10.1093/bioinformatics/btt259. Epub 2013 May 8.

Improving residue-residue contact prediction via low-rank and sparse decomposition of residue correlation matrix.

Biochem Biophys Res Commun. 2016 Mar 25;472(1):217-22. doi: 10.1016/j.bbrc.2016.01.188. Epub 2016 Feb 23.

Improving accuracy of protein contact prediction using balanced network deconvolution.

Proteins. 2015 Mar;83(3):485-96. doi: 10.1002/prot.24744. Epub 2015 Jan 24.

H2rs: deducing evolutionary and functionally important residue positions by means of an entropy and similarity based analysis of multiple sequence alignments.

BMC Bioinformatics. 2014 Apr 27;15:118. doi: 10.1186/1471-2105-15-118.

KScons: a Bayesian approach for protein residue contact prediction using the knob-socket model of protein tertiary structure.

Bioinformatics. 2016 Dec 15;32(24):3774-3781. doi: 10.1093/bioinformatics/btw553. Epub 2016 Aug 24.

De novo structure prediction of globular proteins aided by sequence variation-derived contacts.

PLoS One. 2014 Mar 17;9(3):e92197. doi: 10.1371/journal.pone.0092197. eCollection 2014.

High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features.

Bioinformatics. 2018 Oct 1;34(19):3308-3315. doi: 10.1093/bioinformatics/bty341.

Cited by

Improving prediction accuracy in chimeric proteins with windowed multiple sequence alignment.

Comput Struct Biotechnol J. 2025 Jul 23;27:3292-3298. doi: 10.1016/j.csbj.2025.07.039. eCollection 2025.

Structure Modeling Protocols for Protein Multimer and RNA in CASP16 With Enhanced MSAs, Model Ranking, and Deep Learning.

Proteins. 2025 Aug 1. doi: 10.1002/prot.70033.

PHACE: Phylogeny-Aware Detection of Molecular Coevolution.

Mol Biol Evol. 2025 Jul 1;42(7). doi: 10.1093/molbev/msaf150.

Transmembrane Homodimers Interface Identification: Predicting Interface Residues in Alpha-Helical Transmembrane Protein Homodimers Using Sequential and Structural Features.

Int J Mol Sci. 2025 Apr 30;26(9):4270. doi: 10.3390/ijms26094270.

EvoWeaver: large-scale prediction of gene functional associations from coevolutionary signals.

Nat Commun. 2025 Apr 24;16(1):3878. doi: 10.1038/s41467-025-59175-6.

Emerging frontiers in protein structure prediction following the AlphaFold revolution.

J R Soc Interface. 2025 Apr;22(225):20240886. doi: 10.1098/rsif.2024.0886. Epub 2025 Apr 16.

Recent progress and future challenges in structure-based protein-protein interaction prediction.

Mol Ther. 2025 May 7;33(5):2252-2268. doi: 10.1016/j.ymthe.2025.04.003. Epub 2025 Apr 6.

Identification of coevolving positions by ancestral reconstruction.

Commun Biol. 2025 Feb 28;8(1):329. doi: 10.1038/s42003-025-07676-x.

Using residue interaction networks to understand protein function and evolution and to engineer new proteins.

Curr Opin Struct Biol. 2024 Dec;89:102922. doi: 10.1016/j.sbi.2024.102922. Epub 2024 Sep 26.

AlphaFold two years on: Validation and impact.

Proc Natl Acad Sci U S A. 2024 Aug 20;121(34):e2315002121. doi: 10.1073/pnas.2315002121. Epub 2024 Aug 12.
