

PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments.

Affiliation

Department of Computer Science, Bioinformatics Group, Centre for Computational Statistics and Machine Learning, University College London, Malet Place, London WC1E 6BT, UK.

Publication information

Bioinformatics. 2012 Jan 15;28(2):184-90. doi: 10.1093/bioinformatics/btr638. Epub 2011 Nov 17.

Abstract

MOTIVATION

The accurate prediction of residue-residue contacts, which are critical for maintaining the native fold of a protein, remains an open problem in structural bioinformatics. Interest in this long-standing problem has increased recently with algorithmic improvements and the rapid growth in the sizes of sequence families. Progress could have a major impact on both structure and function prediction, to name but two benefits. Sequence-based contact predictions are usually made by identifying correlated mutations within multiple sequence alignments (MSAs), most commonly through the information-theoretic approach of calculating mutual information between pairs of sites in proteins. These predictions are often inaccurate because the true covariation signal in the MSA is masked by biases from many ancillary indirect-coupling or phylogenetic effects. Here we present a novel method, PSICOV, which introduces the use of sparse inverse covariance estimation to the problem of protein contact prediction. Our method builds on earlier work that demonstrated corrections for phylogenetic and entropic correlation noise, and it allows accurate discrimination of directly coupled from indirectly coupled mutation correlations in the MSA.
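The distinction between direct and indirect coupling can be illustrated with a toy example. The sketch below is not the PSICOV implementation: it assumes a three-variable Gaussian chain X1 -> X2 -> X3 with a hypothetical coupling strength `a` and unit noise, and it uses plain matrix inversion with no sparsity penalty. The helper names (`invert`, `chain_covariance`, `partial_correlations`) are invented for illustration. It shows only the underlying idea: X1 and X3 have a clearly non-zero covariance because both depend on X2, yet the corresponding entry of the inverse covariance matrix is zero, so the indirect link disappears.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]


def chain_covariance(a):
    """Exact covariance of X1 ~ N(0,1), X2 = a*X1 + e2, X3 = a*X2 + e3 (unit-variance noise)."""
    v2 = a * a + 1.0
    v3 = a * a * v2 + 1.0
    return [
        [1.0, a, a * a],
        [a, v2, a * v2],
        [a * a, a * v2, v3],
    ]


def partial_correlations(cov):
    """Partial correlation of each pair given all other variables, from the inverse covariance."""
    prec = invert(cov)
    n = len(prec)
    return [
        [
            1.0 if i == j else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    cov = chain_covariance(0.8)
    pcorr = partial_correlations(cov)
    print("cov(X1, X3)          =", round(cov[0][2], 4))    # indirect, but non-zero
    print("partial corr(X1, X3) =", round(pcorr[0][2], 4))  # vanishes up to rounding
    print("partial corr(X1, X2) =", round(pcorr[0][1], 4))  # direct coupling remains
```

With real alignments the covariance matrix is estimated from limited data and is typically not invertible as is, which is where a sparsity-inducing estimator comes in; that step is deliberately left out of this sketch.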

RESULTS

PSICOV displays a mean precision substantially better than the best performing normalized mutual information approach and Bayesian networks. For 118 out of 150 targets, the L/5 (i.e. top-L/5 predictions for a protein of length L) precision for long-range contacts (sequence separation >23) was ≥ 0.5, which represents an improvement sufficient to be of significant benefit in protein structure prediction or model quality assessment.
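The L/5 long-range precision quoted above can be computed as sketched below. This is a minimal illustration under stated assumptions, not the evaluation script used in the paper: the function name `top_l5_precision`, the dictionary-of-scores input format, and the choice to divide by the number of predictions actually selected are all hypothetical. Only the cutoff (top L/5 predictions) and the long-range filter (sequence separation >23) come from the text.

```python
def top_l5_precision(scores, true_contacts, length, min_separation_exclusive=23):
    """Precision of the top L/5 long-range predictions.

    scores: dict mapping residue index pairs (i, j) to a predicted contact score.
    true_contacts: iterable of residue index pairs observed in contact in the native structure.
    length: protein length L.
    min_separation_exclusive: keep only pairs with |i - j| greater than this value.
    """
    # Normalise pair ordering so (i, j) and (j, i) are treated as the same pair.
    truth = {tuple(sorted(pair)) for pair in true_contacts}

    long_range = [
        (score, tuple(sorted(pair)))
        for pair, score in scores.items()
        if abs(pair[0] - pair[1]) > min_separation_exclusive
    ]
    # Highest-scoring predictions first.
    long_range.sort(key=lambda item: item[0], reverse=True)

    n_top = max(1, length // 5)
    selected = long_range[:n_top]
    if not selected:
        return 0.0

    hits = sum(1 for _, pair in selected if pair in truth)
    # Assumption: divide by the number of predictions actually selected.
    return hits / len(selected)


if __name__ == "__main__":
    # Toy data for a hypothetical protein of length 50, so L/5 = 10 predictions.
    scores = {(i, i + 30): 1.0 - 0.01 * i for i in range(15)}
    scores[(0, 5)] = 9.9  # short-range pair, ignored by the long-range filter
    native = {(i, i + 30) for i in range(0, 10, 2)} | {(0, 5)}
    print(top_l5_precision(scores, native, length=50))
```

Under this measure, a target counts towards the 118 out of 150 reported above when the returned value is at least 0.5.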

AVAILABILITY

The PSICOV source code can be downloaded from http://bioinf.cs.ucl.ac.uk/downloads/PSICOV.

