IEEE Trans Biomed Eng. 2013 Nov;60(11):2993-3002. doi: 10.1109/TBME.2011.2161306. Epub 2011 Jul 7.
Finding descriptors that can discriminate hotspot residues from other residues remains a challenge in the study of protein interactions. In this paper, descriptors derived from the analysis of amino acid sequences using digital signal processing (DSP) techniques are shown to perform as well as those derived from protein tertiary structure and/or information on the complex. Simulation results show that these descriptors can be used on their own to predict hotspots, via a random forest classifier, with an accuracy of 79% and a precision of 75%. They can also be combined with features derived from tertiary structures to raise performance to an accuracy of 82% and a precision of 80%.
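To make the idea of sequence-derived DSP descriptors concrete, here is a minimal sketch of how an amino acid sequence could be turned into a numeric signal and summarized in the frequency domain. It is not the authors' method: the abstract does not say which numeric encoding or spectral features are used, so the Kyte-Doolittle hydropathy scale, the function names (`encode`, `power_spectrum`, `spectral_descriptors`) and the three summary features are all assumptions chosen for illustration.

```python
import cmath

# Kyte-Doolittle hydropathy values, used only as an illustrative numeric
# encoding (an assumption; the abstract does not name the encoding used).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence):
    """Map a one-letter amino acid sequence to a list of numbers."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return [HYDROPATHY[aa] for aa in sequence.upper()]


def power_spectrum(signal):
    """Power spectrum of the mean-removed signal for bins 0..n//2 (plain DFT)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def spectral_descriptors(sequence):
    """Hypothetical descriptors: total power, dominant bin, and its share."""
    ps = power_spectrum(encode(sequence))
    total = sum(ps)
    peak = max(range(len(ps)), key=ps.__getitem__)
    return {
        "total_power": total,
        "peak_bin": peak,
        "peak_fraction": ps[peak] / total if total else 0.0,
    }
```

In a pipeline like the one the abstract describes, feature vectors of this kind would be computed per residue (for example over a window around it) and passed to a random forest classifier, either alone or concatenated with structure-derived features.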