Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri.
Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri.
Proteins. 2018 May;86(5):592-598. doi: 10.1002/prot.25487. Epub 2018 Mar 12.
Protein secondary structure prediction can provide important information for predicting protein 3D structure and protein function. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as a software tool, MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein; it contains a rich set of information derived from individual amino acids as well as from the context of the protein sequence. Specifically, the feature matrix combines physicochemical properties of amino acids, the PSI-BLAST profile, and the HHBlits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight or three secondary structure states. This architecture enables effective processing of both local and global interactions between amino acids, which supports accurate prediction. In extensive experiments on multiple datasets, MUFOLD-SS significantly outperformed the best existing methods and other deep neural networks. MUFOLD-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html.
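To make the input and output described in the abstract concrete, the following is a minimal sketch, not the MUFOLD-SS implementation. It assumes the feature matrix is built by concatenating, per residue, one row from each feature source, and it uses one commonly used reduction from eight DSSP-style states to three states. The function names, the toy feature widths, and the exact state mapping are illustrative assumptions and are not taken from the article.

```python
# Hypothetical sketch: names, feature widths, and the state mapping below are
# illustrative assumptions, not values taken from MUFOLD-SS.

# One commonly used reduction from eight DSSP-style states to three states.
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",   # helix types -> H
    "E": "E", "B": "E",             # strand and bridge -> E
    "T": "C", "S": "C", "C": "C",   # turn, bend, coil -> C
}


def q8_to_q3(q8_labels):
    """Map a string of eight-state labels to three-state labels."""
    return "".join(Q8_TO_Q3[state] for state in q8_labels)


def build_feature_matrix(physchem, psiblast, hhblits):
    """Concatenate per-residue feature rows into one matrix (list of rows).

    Each argument is a list with one row of numbers per residue. The result
    has one row per residue whose width is the sum of the three input widths.
    """
    if not (len(physchem) == len(psiblast) == len(hhblits)):
        raise ValueError("all feature sources must cover the same residues")
    return [
        list(a) + list(b) + list(c)
        for a, b, c in zip(physchem, psiblast, hhblits)
    ]


if __name__ == "__main__":
    # Toy protein of 3 residues with made-up feature widths 2, 3, and 2.
    physchem = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    psiblast = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    hhblits = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]

    matrix = build_feature_matrix(physchem, psiblast, hhblits)
    print(len(matrix), len(matrix[0]))
    print(q8_to_q3("HGIEBTSC"))
```

In a real pipeline the per-residue rows would come from a physicochemical property table and from PSI-BLAST and HHBlits profile files, and the resulting matrix would be passed to the network, which outputs one eight-state or three-state label per residue.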