Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks.

Authors

Hanson Jack, Yang Yuedong, Paliwal Kuldip, Zhou Yaoqi

Affiliations

Signal Processing Laboratory, Griffith University, Brisbane 4122, Australia.

Institute for Glycomics, Griffith University, Gold Coast 4215, Australia.

Publication details

Bioinformatics. 2017 Mar 1;33(5):685-692. doi: 10.1093/bioinformatics/btw678.

Abstract

MOTIVATION

Capturing long-range interactions between residues that are neighbors in structure but not in sequence is a long-standing challenge in bioinformatics. Recently, long short-term memory (LSTM) networks have significantly improved the accuracy of speech and image classification by remembering useful past information in long sequential events. Here, we apply deep bidirectional LSTM recurrent neural networks to the problem of protein intrinsic disorder prediction.

RESULTS

The new method, named SPOT-Disorder, consistently improves over a similar method that uses a traditional, window-based neural network (SPINE-D) in all datasets tested, without separate training on short and long disordered regions. Independent tests on four other datasets, including the datasets from critical assessment of structure prediction (CASP) techniques and >10 000 annotated proteins from MobiDB, confirmed SPOT-Disorder as one of the best methods in disorder prediction. Moreover, initial studies indicate that the method is more accurate in predicting functional sites in disordered regions. These results highlight the usefulness of combining LSTM with deep bidirectional recurrent neural networks in capturing non-local, long-range interactions for bioinformatics applications.
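
The contrast between a window-based predictor and a bidirectional recurrent one can be illustrated with a toy sketch. This is not the SPOT-Disorder architecture: it assumes a single LSTM unit per direction, one scalar feature per residue, and arbitrary untrained weights, and the function names (`lstm_pass`, `bidirectional_lstm`, `window_predictor`) are hypothetical. It only shows that a bidirectional pass lets the score at one position depend on residues arbitrarily far away in either direction, whereas a fixed window cannot see beyond its edges.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_pass(xs, w):
    """Run a single-unit LSTM over xs; return the hidden state at each step."""
    h, c, out = 0.0, 0.0, []
    for x in xs:
        i = sigmoid(w["i_x"] * x + w["i_h"] * h)    # input gate
        f = sigmoid(w["f_x"] * x + w["f_h"] * h)    # forget gate
        o = sigmoid(w["o_x"] * x + w["o_h"] * h)    # output gate
        g = math.tanh(w["g_x"] * x + w["g_h"] * h)  # candidate cell value
        c = f * c + i * g                           # updated cell memory
        h = o * math.tanh(c)                        # updated hidden state
        out.append(h)
    return out

def bidirectional_lstm(xs, w_fwd, w_bwd):
    """Combine a left-to-right and a right-to-left pass into per-position scores."""
    fwd = lstm_pass(xs, w_fwd)
    bwd = lstm_pass(xs[::-1], w_bwd)[::-1]  # re-align backward states with positions
    return [sigmoid(a + b) for a, b in zip(fwd, bwd)]

def window_predictor(xs, half_width=1):
    """Score each position from the mean of a fixed local window only."""
    scores = []
    for k in range(len(xs)):
        lo, hi = max(0, k - half_width), min(len(xs), k + half_width + 1)
        window = xs[lo:hi]
        scores.append(sigmoid(sum(window) / len(window)))
    return scores

# Arbitrary illustrative weights (assumption: untrained, same for both directions).
W = {k: 0.5 for k in ("i_x", "i_h", "f_x", "f_h", "o_x", "o_h", "g_x", "g_h")}

# Two toy feature sequences that differ only at the last position.
seq_a = [0.2, -0.1, 0.4, 0.0, 0.3, -0.5, 0.1, 0.6]
seq_b = seq_a[:-1] + [-0.9]

print(bidirectional_lstm(seq_a, W, W)[0], bidirectional_lstm(seq_b, W, W)[0])
print(window_predictor(seq_a)[0], window_predictor(seq_b)[0])
```

In this sketch the window-based score at the first position is identical for both sequences, while the bidirectional score changes because the backward pass carries information from the last position. A practical model would stack several such layers with many units each and learn the weights from data.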

AVAILABILITY AND IMPLEMENTATION

SPOT-Disorder is available as a web server and as a standalone program at: http://sparks-lab.org/server/SPOT-disorder/index.php.

CONTACT

j.hanson@griffith.edu.au or yuedong.yang@griffith.edu.au or yaoqi.zhou@griffith.edu.au.

SUPPLEMENTARY INFORMATION

Supplementary data is available at Bioinformatics online.
