School of Informatics, Indiana University Purdue University, Indianapolis, IN 46202, USA.
Bioinformatics. 2010 Aug 1;26(15):1857-63. doi: 10.1093/bioinformatics/btq295. Epub 2010 Jun 4.
Template-based prediction of DNA binding proteins requires not only structural similarity between target and template structures but also prediction of binding affinity between the target and DNA to ensure binding. Here, we propose to predict protein-DNA binding affinity by introducing a new volume-fraction correction to a statistical energy function based on a distance-scaled, finite, ideal-gas reference (DFIRE) state.
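For orientation, the pairwise DFIRE potential as it is usually written in the earlier DFIRE literature is sketched below. This is background recalled from that literature, not taken from this abstract: the symbols (\eta, \alpha, r_{cut}, \Delta r, N_{obs}) are assumptions about notation, and the volume-fraction correction introduced here is not reproduced because the abstract does not specify its form.

```latex
% Uncorrected DFIRE pair potential between atom types i and j at distance r.
% N_obs(i,j,r): observed number of (i,j) pairs in the distance shell at r
% r_cut: cutoff distance; \Delta r, \Delta r_cut: bin widths at r and r_cut
% \alpha: distance-scaling exponent; \eta: scaling constant (notation assumed)
\bar{u}(i,j,r) =
\begin{cases}
  -\eta R T \,\ln \dfrac{N_{obs}(i,j,r)}
     {\left(\dfrac{r}{r_{cut}}\right)^{\alpha}
      \dfrac{\Delta r}{\Delta r_{cut}}\, N_{obs}(i,j,r_{cut})}, & r < r_{cut}, \\[2ex]
  0, & r \ge r_{cut}.
\end{cases}
```

The denominator plays the role of the expected pair count in the finite ideal-gas reference state, scaled from the count observed at the cutoff distance.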
We show that this energy function, combined with the structural alignment program TM-align, achieves a Matthews correlation coefficient (MCC) of 0.76, with an accuracy of 98%, a precision of 93% and a sensitivity of 64%, for predicting DNA binding proteins in a benchmark of 179 DNA binding proteins and 3797 non-binding proteins. This MCC is substantially higher than the best value of 0.69 given by previous methods. Applying the method to 2235 structural genomics targets identified 37 as DNA binding proteins. Of these, 27 (73%) are annotated as putatively DNA binding, only 1 has annotated functions that do not include DNA binding, and the remaining 9 have unknown function. The method provides a highly accurate and sensitive technique for structure-based prediction of DNA binding proteins.
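The reported metrics can be cross-checked against each other with a short calculation. The sketch below is illustrative only: the confusion-matrix counts are not given in the abstract and are back-calculated here by rounding from the reported sensitivity (64% of 179 positives) and precision (93%), so TP = 115, FN = 64, FP = 9 and TN = 3788 are assumptions, and the function name `binary_metrics` is hypothetical.

```python
from math import sqrt


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return accuracy, precision, sensitivity and MCC for a binary classifier."""
    total = tp + fp + tn + fn
    # MCC denominator: geometric mean of the four marginal totals.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Assumed counts, reconstructed from the reported percentages (not from the paper):
# 179 binders -> TP + FN = 179; 3797 non-binders -> TN + FP = 3797.
metrics = binary_metrics(tp=115, fp=9, tn=3788, fn=64)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed counts, the four values round to the figures quoted in the abstract, which also illustrates why accuracy is so high on a benchmark dominated by non-binding proteins while sensitivity remains moderate.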
The method is implemented as a part of the Structure-based function-Prediction On-line Tools (SPOT) package available at http://sparks.informatics.iupui.edu/spot