Du Tianchuan, Liao Li, Wu Cathy H, Sun Bilin
Department of Computer and Information Sciences, University of Delaware, Newark, DE, USA.
Department of Computer and Information Sciences, University of Delaware, Newark, DE, USA; Center for Bioinformatics & Computational Biology, University of Delaware, Newark, DE, USA.
Methods. 2016 Nov 1;110:97-105. doi: 10.1016/j.ymeth.2016.06.001. Epub 2016 Jun 6.
Protein-protein interactions play essential roles in many biological processes. Knowing the residue-residue contacts between two interacting proteins is not only helpful in annotating protein function, but also critical for structure-based drug design. Predicting the residue-residue contact matrix of the interfacial regions is challenging. In this work, we introduced deep learning techniques (specifically, stacked autoencoders) to build deep neural network models to tackle the residue-residue contact prediction problem. Interaction profile Hidden Markov Models were used first to extract Fisher score features from protein sequences, and stacked autoencoders were then deployed to extract and learn hidden abstract features. The deep learning model showed significant improvement over the traditional machine learning model, Support Vector Machines (SVM), with the overall accuracy rising from 65.40% to 80.82%, a gain of about 15 percentage points. We showed that the stacked autoencoders could extract novel features from the Fisher score features, and that these can be utilized by deep neural networks and other classifiers to enhance learning. We further showed that deep neural networks have significant advantages over SVM in making use of the newly extracted features.
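As a rough illustration of the stacked-autoencoder idea described in the abstract, the following minimal sketch pretrains two sigmoid autoencoder layers greedily, each one learning to reconstruct the output of the layer before it. It is not the authors' implementation: the input matrix is synthetic data standing in for Fisher score features, and the layer sizes, learning rate, epoch count, and the helper name `train_autoencoder` are all assumptions made for the example. The interaction profile Hidden Markov Model step and the supervised fine-tuning of the final classifier are omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, n_hidden, epochs=200, lr=0.5, seed=0):
    """Train one sigmoid autoencoder layer (hypothetical helper).

    Uses full-batch gradient descent on the squared reconstruction error.
    Returns the encoder weights, encoder bias, and the per-epoch loss.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)          # encode
        R = sigmoid(H @ W2 + b2)          # decode (reconstruction)
        err = R - X
        losses.append(float(np.mean(err ** 2)))
        # Backpropagation; constant factors are absorbed into lr.
        dR = err * R * (1.0 - R) / n
        dH = (dR @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ dR)
        b2 -= lr * dR.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, losses


# Synthetic stand-in for Fisher score features (assumption, not real data):
# 200 samples, 20 features generated from 3 hidden factors.
rng = np.random.default_rng(42)
latent = rng.random((200, 3))
mixing = rng.normal(0.0, 1.0, (3, 20))
X = sigmoid(latent @ mixing - 1.0)

# Greedy layer-wise pretraining: each layer is trained on the previous
# layer's encoded output. The sizes 12 and 6 are arbitrary choices.
layer_sizes = [12, 6]
encoders, loss_history = [], []
inputs = X
for i, size in enumerate(layer_sizes):
    W, b, losses = train_autoencoder(inputs, size, seed=i)
    encoders.append((W, b))
    loss_history.append(losses)
    inputs = sigmoid(inputs @ W + b)

features = inputs  # abstract features that a classifier could consume
print("feature matrix shape:", features.shape)
```

In a full pipeline along the lines the abstract describes, `features` (or the pretrained encoder weights) would be passed to a supervised classifier such as a deep neural network or an SVM.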