Near perfect protein multi-label classification with deep neural networks.

Affiliations

PIT Bioinformatics Group, Eötvös University, H-1117 Budapest, Hungary.

PIT Bioinformatics Group, Eötvös University, H-1117 Budapest, Hungary; Uratim Ltd., H-1118 Budapest, Hungary.

Publication information

Methods. 2018 Jan 1;132:50-56. doi: 10.1016/j.ymeth.2017.06.034. Epub 2017 Jul 3.

DOI:10.1016/j.ymeth.2017.06.034

PMID:28684341

Abstract

Biological sequences can be considered as data items of high, non-fixed dimension, corresponding to the length of those sequences. Comparing and classifying biological sequences against large databases are important areas of research today. Artificial neural networks (ANNs) have gained well-deserved popularity among machine learning tools following their recent successful applications in image and sound processing and in classification problems. ANNs have also been applied to predict the family or function of a protein from its residue sequence. Here we present two new ANNs with multi-label classification ability, showing impressive accuracy when classifying protein sequences into 698 UniProt families (AUC=99.99%) and 983 Gene Ontology classes (AUC=99.45%).
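The two ideas the abstract relies on, variable-length sequences as network input and AUC as a multi-label quality measure, can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed-length one-hot encoding with zero padding, the function names (`one_hot_encode`, `binary_auc`, `micro_auc`), and the choice of micro-averaging are all assumptions made for illustration, since the abstract does not say how sequences are encoded or how the AUC is aggregated over classes.

```python
from typing import List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_encode(seq: str, max_len: int) -> List[List[int]]:
    """Encode a residue sequence as a max_len x 20 one-hot matrix.

    Assumption for illustration: longer sequences are truncated, shorter
    ones are padded with all-zero rows, and unknown residues map to
    all-zero rows.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    matrix = []
    for pos in range(max_len):
        row = [0] * len(AMINO_ACIDS)
        if pos < len(seq) and seq[pos] in index:
            row[index[seq[pos]]] = 1
        matrix.append(row)
    return matrix


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def micro_auc(label_rows: Sequence[Sequence[int]],
              score_rows: Sequence[Sequence[float]]) -> float:
    """Pool every (sequence, class) pair and compute a single AUC.

    Micro-averaging is an assumption here; per-class (macro) averaging
    is an equally plausible reading of a multi-label AUC.
    """
    flat_labels = [y for row in label_rows for y in row]
    flat_scores = [s for row in score_rows for s in row]
    return binary_auc(flat_labels, flat_scores)
```

In a multi-label setting each row of `label_rows` marks every class a sequence belongs to, so a single protein can carry several positive labels at once, for example several Gene Ontology classes.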

Similar articles

Near perfect protein multi-label classification with deep neural networks.

Methods. 2018 Jan 1;132:50-56. doi: 10.1016/j.ymeth.2017.06.034. Epub 2017 Jul 3.

SECLAF: a webserver and deep neural network design tool for hierarchical biological sequence classification.

Bioinformatics. 2018 Jul 15;34(14):2487-2489. doi: 10.1093/bioinformatics/bty116.

Predicting human protein function with multi-task deep neural networks.

PLoS One. 2018 Jun 11;13(6):e0198216. doi: 10.1371/journal.pone.0198216. eCollection 2018.

Improving GENCODE reference gene annotation using a high-stringency proteogenomics workflow.

Nat Commun. 2016 Jun 2;7:11778. doi: 10.1038/ncomms11778.

Deep Convolutional Neural Networks for Endotracheal Tube Position and X-ray Image Classification: Challenges and Opportunities.

J Digit Imaging. 2017 Aug;30(4):460-468. doi: 10.1007/s10278-017-9980-7.

MapReduce Based Parallel Neural Networks in Enabling Large Scale Machine Learning.

Comput Intell Neurosci. 2015;2015:297672. doi: 10.1155/2015/297672. Epub 2015 Nov 22.

Channel selection and classification of electroencephalogram signals: an artificial neural network and genetic algorithm-based approach.

Artif Intell Med. 2012 Jun;55(2):117-26. doi: 10.1016/j.artmed.2012.02.001. Epub 2012 Apr 12.

Predicting three-year kidney graft survival in recipients with systemic lupus erythematosus.

ASAIO J. 2011 Jul-Aug;57(4):300-9. doi: 10.1097/MAT.0b013e318222db30.

Applications of artificial neural networks (ANNs) in food science.

Crit Rev Food Sci Nutr. 2007;47(2):113-26. doi: 10.1080/10408390600626453.

Using Proteomics Bioinformatics Tools and Resources in Proteogenomic Studies.

Adv Exp Med Biol. 2016;926:65-75. doi: 10.1007/978-3-319-42316-6_5.

Cited by

Learning maximally spanning representations improves protein function annotation.

bioRxiv. 2025 Feb 17:2025.02.13.638156. doi: 10.1101/2025.02.13.638156.

The applications of deep learning algorithms on in silico druggable proteins identification.

J Adv Res. 2022 Nov;41:219-231. doi: 10.1016/j.jare.2022.01.009. Epub 2022 Jan 22.

Protein Science Meets Artificial Intelligence: A Systematic Review and a Biochemical Meta-Analysis of an Inter-Field.

Front Bioeng Biotechnol. 2022 Jul 7;10:788300. doi: 10.3389/fbioe.2022.788300. eCollection 2022.

Identifying super-feminine, super-masculine and sex-defining connections in the human braingraph.

Cogn Neurodyn. 2021 Dec;15(6):949-959. doi: 10.1007/s11571-021-09687-w. Epub 2021 Jul 15.

An improved deep learning model for hierarchical classification of protein families.

PLoS One. 2021 Oct 20;16(10):e0258625. doi: 10.1371/journal.pone.0258625. eCollection 2021.

DeepT3 2.0: improving type III secreted effector predictions by an integrative deep learning framework.

NAR Genom Bioinform. 2021 Oct 4;3(4):lqab086. doi: 10.1093/nargab/lqab086. eCollection 2021 Dec.

Protein function prediction with gene ontology: from traditional to deep learning models.

PeerJ. 2021 Aug 24;9:e12019. doi: 10.7717/peerj.12019. eCollection 2021.

Multimodal deep representation learning for protein interaction identification and protein family classification.

BMC Bioinformatics. 2019 Dec 2;20(Suppl 16):531. doi: 10.1186/s12859-019-3084-y.
