Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy function.

Affiliation

School of Informatics, Indiana University Purdue University, Indianapolis, IN 46202, USA.

Publication information

Bioinformatics. 2010 Aug 1;26(15):1857-63. doi: 10.1093/bioinformatics/btq295. Epub 2010 Jun 4.

Abstract

MOTIVATION

Template-based prediction of DNA binding proteins requires not only structural similarity between target and template structures but also prediction of binding affinity between the target and DNA to ensure binding. Here, we propose to predict protein-DNA binding affinity by introducing a new volume-fraction correction to a statistical energy function based on a distance-scaled, finite, ideal-gas reference (DFIRE) state.
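
The two requirements described above (structural similarity to a template and a favorable predicted binding energy) can be sketched as a simple decision rule. This is only an illustrative sketch, not the authors' implementation: the `TemplateHit` type, the `predict_dna_binding` function, and both threshold parameters are hypothetical names, and the abstract states no cutoff values. The scores are assumed to have been computed elsewhere, for example a TM-score from TM-align and an energy from the DFIRE-based function.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TemplateHit:
    """One target-template comparison (hypothetical container type)."""
    template_id: str
    tm_score: float        # structural similarity; higher means more similar
    binding_energy: float  # predicted protein-DNA energy; lower means stronger binding


def predict_dna_binding(
    hits: Iterable[TemplateHit],
    min_tm_score: float,
    max_energy: float,
) -> Optional[TemplateHit]:
    """Return the best hit passing BOTH criteria, or None if no hit passes.

    Both thresholds are assumptions supplied by the caller; the abstract
    does not report the cutoffs used in the paper.
    """
    passing = [
        h for h in hits
        if h.tm_score >= min_tm_score and h.binding_energy <= max_energy
    ]
    # Among passing templates, prefer the most favorable (lowest) energy.
    return min(passing, key=lambda h: h.binding_energy, default=None)


# Usage with made-up scores and made-up thresholds.
example_hits = [
    TemplateHit("tmplA", tm_score=0.72, binding_energy=-3.0),   # similar, weak binding
    TemplateHit("tmplB", tm_score=0.61, binding_energy=-15.5),  # passes both criteria
    TemplateHit("tmplC", tm_score=0.30, binding_energy=-20.0),  # strong binding, dissimilar
]
best = predict_dna_binding(example_hits, min_tm_score=0.5, max_energy=-10.0)
```

Requiring both criteria means a structurally similar template with weak predicted binding, or a strongly binding but structurally dissimilar one, is rejected.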

RESULTS

We showed that this energy function, together with the structural alignment program TM-align, achieves a Matthews correlation coefficient (MCC) of 0.76, with an accuracy of 98%, a precision of 93% and a sensitivity of 64%, for predicting DNA binding proteins in a benchmark of 179 DNA binding proteins and 3797 non-binding proteins. This MCC value is substantially higher than the best MCC value of 0.69 given by previous methods. Application of this method to 2235 structural genomics targets identified 37 as DNA binding proteins: 27 (73%) are putatively DNA binding, only 1 has annotated functions that do not include DNA binding, and the remaining proteins have unknown function. The method provides a highly accurate and sensitive technique for structure-based prediction of DNA binding proteins.
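
To make the relationship between the reported metrics concrete, the sketch below computes accuracy, precision, sensitivity and MCC from a confusion matrix. The counts used (115 true positives, 9 false positives, 64 false negatives, 3788 true negatives) are not stated in the abstract; they are an assumption, back-calculated from the reported percentages and the benchmark sizes of 179 binding and 3797 non-binding proteins. The function name `binary_metrics` is likewise illustrative.

```python
import math


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        # MCC is conventionally set to 0 when any marginal total is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# ASSUMED counts, reconstructed from the reported percentages (not from the paper):
# 179 binders -> tp + fn = 179; 3797 non-binders -> fp + tn = 3797.
metrics = binary_metrics(tp=115, fp=9, fn=64, tn=3788)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these assumed counts, each metric rounds to the value quoted above. The example also shows why accuracy is high even though sensitivity is modest: non-binding proteins outnumber binding proteins by roughly 21 to 1, so correct rejections dominate the accuracy figure, whereas MCC accounts for all four cells of the confusion matrix.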

AVAILABILITY

The method is implemented as a part of the Structure-based function-Prediction On-line Tools (SPOT) package available at http://sparks.informatics.iupui.edu/spot
