Res I, Mihalek I, Lichtarge O
Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.
Bioinformatics. 2005 May 15;21(10):2496-501. doi: 10.1093/bioinformatics/bti340. Epub 2005 Feb 22.
The number of available protein structures still lags far behind the number of known protein sequences, which makes it important to predict which residues participate in protein-protein interactions from sequence information alone. Few studies have tackled this problem to date.
We applied support vector machines to sequences to classify every protein residue as either part of a protein interface or not. For the first time, evolutionary information was used as one of the attributes, and including evolutionary importance rankings improves the classification. Leave-one-out cross-validation experiments show that prediction accuracy reaches 64%.
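The workflow described above (per-residue attributes, a support vector machine, leave-one-out validation) can be illustrated with a minimal sketch. Everything specific in it is an assumption rather than the authors' method: the abstract does not state the kernel, the attribute set, or the window size, so the sketch uses a linear SVM trained by Pegasos-style subgradient descent, a sliding window over a single hypothetical per-residue importance score, and a scheme that holds out one protein at a time. The function names and the toy data are invented for illustration and are not results from the paper.

```python
import random


def window_features(scores, i, half_width=1):
    """Attributes for residue i: scores in a sliding window, zero-padded at the ends."""
    return [scores[j] if 0 <= j < len(scores) else 0.0
            for j in range(i - half_width, i + half_width + 1)]


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Linear SVM via Pegasos-style subgradient descent on the hinge loss.

    Labels must be +1 (interface) or -1 (non-interface). The linear kernel
    is an assumption; the abstract does not name the kernel that was used.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * (sum(wk * xk for wk, xk in zip(w, X[i])) + b)
            # Shrink the weights (regularisation term of the objective).
            w = [(1.0 - eta * lam) * wk for wk in w]
            if margin < 1:
                # The example violates the margin: step along its hinge subgradient.
                w = [wk + eta * y[i] * xk for wk, xk in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wk * xk for wk, xk in zip(w, x)) + b >= 0 else -1


def leave_one_protein_out(proteins, half_width=1):
    """Hold out each protein in turn, train on the rest, and score its residues."""
    correct = total = 0
    for held_out in range(len(proteins)):
        X_train, y_train = [], []
        for p, (scores, labels) in enumerate(proteins):
            if p == held_out:
                continue
            for i in range(len(scores)):
                X_train.append(window_features(scores, i, half_width))
                y_train.append(labels[i])
        w, b = train_linear_svm(X_train, y_train)
        scores, labels = proteins[held_out]
        for i in range(len(scores)):
            x = window_features(scores, i, half_width)
            correct += predict(w, b, x) == labels[i]
            total += 1
    return correct / total


# Toy data, invented for illustration only: (importance scores, interface labels).
toy_proteins = [
    ([0.9, 0.8, 0.2, 0.1, 0.1], [1, 1, -1, -1, -1]),
    ([0.1, 0.2, 0.9, 0.8, 0.7], [-1, -1, 1, 1, 1]),
    ([0.2, 0.9, 0.8, 0.1, 0.2], [-1, 1, 1, -1, -1]),
]

accuracy = leave_one_protein_out(toy_proteins)
print(f"leave-one-protein-out accuracy: {accuracy:.2f}")
```

Holding out whole proteins rather than single residues keeps neighbouring residues of the same chain from appearing in both the training and test sets. Whether the paper's leave-one-out protocol works this way is not stated in the abstract.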