Improving Protein Fold Recognition by Deep Learning Networks.

Authors

Jo Taeho, Hou Jie, Eickholt Jesse, Cheng Jianlin

Affiliations

Department of Computer Science, University of Missouri, Columbia, MO 65211, USA.

Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, 48109, USA.

Publication information

Sci Rep. 2015 Dec 4;5:17573. doi: 10.1038/srep17573.

Abstract

For accurate recognition of protein folds, a deep learning network method (DN-Fold) was developed to predict whether a given query-template protein pair belongs to the same structural fold. The input was derived from protein sequence and structural features extracted from the protein pair. We evaluated the performance of DN-Fold along with 18 different methods on Lindahl's benchmark dataset and on a large benchmark set extracted from SCOP 1.75 consisting of about one million protein pairs, at three different levels of fold recognition (i.e., protein family, superfamily, and fold) depending on the evolutionary distance between protein sequences. The correct recognition rate of ensembled DN-Fold for Top 1 predictions is 84.5%, 61.5%, and 33.6%, and for Top 5 predictions is 91.2%, 76.5%, and 60.7%, at the family, superfamily, and fold levels, respectively. We also evaluated the performance of single DN-Fold (DN-FoldS), which showed results comparable to those of ensemble DN-Fold at the family and superfamily levels. Finally, we extended the binary classification problem of fold recognition to a real-valued regression task, which also showed promising performance. DN-Fold is freely available through a web server at http://iris.rnet.missouri.edu/dnfold.
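
The Top 1 and Top 5 recognition rates quoted in the abstract can be illustrated with a minimal sketch. It assumes each query-template pair has already been given a score by some classifier, that a query counts as recognized when at least one same-fold template appears among its k highest-scoring templates, and that every query in the input counts toward the denominator. The function name, the tuple layout, and the toy data are hypothetical and are not taken from DN-Fold or from the benchmark datasets.

```python
from collections import defaultdict


def top_k_recognition_rate(pairs, k):
    """Fraction of queries with a correct template among their k best-scored ones.

    pairs: iterable of (query_id, template_id, score, same_fold) tuples, where
    `score` is a classifier output (higher means more likely the same fold)
    and `same_fold` is the ground-truth label. This layout is an assumption
    made for illustration only.
    """
    by_query = defaultdict(list)
    for query_id, _template_id, score, same_fold in pairs:
        by_query[query_id].append((score, same_fold))

    hits = 0
    for candidates in by_query.values():
        # Rank this query's templates from highest to lowest score.
        ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
        if any(same for _score, same in ranked[:k]):
            hits += 1

    return hits / len(by_query) if by_query else 0.0


# Hypothetical toy data: three queries, three templates.
toy_pairs = [
    ("q1", "t1", 0.9, True),
    ("q1", "t2", 0.4, False),
    ("q2", "t1", 0.8, False),
    ("q2", "t3", 0.7, True),
    ("q3", "t2", 0.6, False),
    ("q3", "t3", 0.2, False),
]

top1 = top_k_recognition_rate(toy_pairs, 1)  # only q1 is recognized
top2 = top_k_recognition_rate(toy_pairs, 2)  # q1 and q2 are recognized
print(f"Top 1: {top1:.3f}, Top 2: {top2:.3f}")
```

In this toy set, q3 has no same-fold template at all and therefore always counts as a miss; an evaluation protocol may instead exclude such queries, which would change the denominator.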

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e76/4669437/4e69804e382a/srep17573-f1.jpg
