MUFOLD-SS: New deep inception-inside-inception networks for protein secondary structure prediction.

Affiliations

Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri.

Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri.

Publication information

Proteins. 2018 May;86(5):592-598. doi: 10.1002/prot.25487. Epub 2018 Mar 12.

Abstract

Protein secondary structure prediction can provide important information for predicting protein 3D structure and protein function. Deep learning offers a new opportunity to significantly improve prediction accuracy. In this article, a new deep neural network architecture, named the Deep inception-inside-inception (Deep3I) network, is proposed for protein secondary structure prediction and implemented as the software tool MUFOLD-SS. The input to MUFOLD-SS is a carefully designed feature matrix corresponding to the primary amino acid sequence of a protein, which consists of a rich set of information derived from individual amino acids as well as from the context of the protein sequence. Specifically, the feature matrix is composed of the physicochemical properties of amino acids, the PSI-BLAST profile, and the HHblits profile. MUFOLD-SS is composed of a sequence of nested inception modules and maps the input matrix to either eight states or three states of secondary structure. The architecture of MUFOLD-SS enables effective processing of local and global interactions between amino acids to make accurate predictions. In extensive experiments on multiple datasets, MUFOLD-SS significantly outperformed the best existing methods and other deep neural networks. MUFOLD-SS can be downloaded from http://dslsrv8.cs.missouri.edu/~cf797/MUFoldSS/download.html.
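
The input and output described in the abstract can be illustrated with a small sketch. It is not the MUFOLD-SS code. The function names, the toy feature dimensions, and all numeric values are hypothetical. The eight-to-three state reduction is one commonly used convention, since the abstract does not say which mapping the tool applies. The sketch shows only how three per-residue feature groups might be concatenated into one matrix row per residue, and how eight-state labels can be collapsed to three.

```python
from typing import Dict, List, Sequence

# One common reduction of eight DSSP-style states to three classes.
# This mapping is an assumption; the abstract does not specify it.
Q8_TO_Q3: Dict[str, str] = {
    "H": "H", "G": "H", "I": "H",  # helix types -> helix
    "E": "E", "B": "E",            # strand and bridge -> strand
    "T": "C", "S": "C", "L": "C",  # turn, bend, loop/other -> coil
}


def build_feature_matrix(
    sequence: str,
    physchem: Dict[str, Sequence[float]],
    psiblast_profile: Sequence[Sequence[float]],
    hhblits_profile: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate the three feature groups into one row per residue."""
    if not (len(sequence) == len(psiblast_profile) == len(hhblits_profile)):
        raise ValueError("each profile needs exactly one row per residue")
    matrix: List[List[float]] = []
    for i, residue in enumerate(sequence):
        row = (
            list(physchem[residue])      # properties of the amino acid itself
            + list(psiblast_profile[i])  # PSI-BLAST profile row for position i
            + list(hhblits_profile[i])   # HHblits profile row for position i
        )
        matrix.append(row)
    return matrix


def q8_to_q3(states: str) -> str:
    """Collapse a string of eight-state labels to three-state labels."""
    return "".join(Q8_TO_Q3[s] for s in states)


if __name__ == "__main__":
    # Toy inputs with made-up values and made-up column counts.
    seq = "ACD"
    physchem = {"A": [0.1, 0.2], "C": [0.3, 0.4], "D": [0.5, 0.6]}
    psi = [[0.0] * 3 for _ in seq]
    hh = [[1.0] * 4 for _ in seq]
    features = build_feature_matrix(seq, physchem, psi, hh)
    print(len(features), len(features[0]))  # 3 9
    print(q8_to_q3("HGIEBTSL"))             # HHHEECCC
```

In this toy setup the network input would be a 3 x 9 matrix (three residues, 2 + 3 + 4 feature columns). A real pipeline would derive the profile rows from PSI-BLAST and HHblits runs rather than from constants.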
