Sequence representation approaches for sequence-based protein prediction tasks that use deep learning.

Author information

Cui Feifei, Zhang Zilong, Zou Quan

Affiliations

University of Electronic Science and Technology of China, Chengdu, Sichuan, China.

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China.

Publication information

Brief Funct Genomics. 2021 Mar 2;20(1):61-73. doi: 10.1093/bfgp/elaa030.

Abstract

Deep learning is increasingly used in bioinformatics, especially in sequence-based protein prediction tasks, because large amounts of biological data are now available and deep learning techniques have developed rapidly in recent years. For these tasks, selecting a suitable model architecture is essential, and the representation of the sequence data is a major factor in determining model performance. Here, we summarize all the main approaches used to represent protein sequence data (amino acid sequence encoding or embedding): end-to-end embedding methods, non-contextual embedding methods, embedding methods that use transfer learning, and others applied to specific tasks (such as protein sequence embedding based on extracted features for protein structure prediction, and graph convolutional network-based embedding for drug discovery). We also review the architectures of the various types of embedding models from a theoretical standpoint, along with the development of these sequence embedding approaches, to help researchers and users select the model that best suits their requirements.
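
The abstract does not spell out any particular encoding, so the following is only a minimal sketch of what "amino acid sequence encoding" can look like in its simplest form: an integer index per residue (the usual input to a learned embedding layer) and a one-hot vector per residue. The names `AMINO_ACIDS`, `integer_encode` and `one_hot_encode`, and the handling of unknown residues, are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only: the function names and the treatment of
# non-standard residues are assumptions, not the paper's method.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
AA_TO_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
UNKNOWN_INDEX = len(AMINO_ACIDS)  # shared index for any non-standard letter


def integer_encode(sequence):
    """Map each residue to an integer index (unknown residues map to 20)."""
    return [AA_TO_INDEX.get(aa, UNKNOWN_INDEX) for aa in sequence.upper()]


def one_hot_encode(sequence):
    """Return one 20-dimensional indicator vector per residue.

    Unknown residues become all-zero vectors (an assumption of this sketch).
    """
    vectors = []
    for idx in integer_encode(sequence):
        row = [0] * len(AMINO_ACIDS)
        if idx < len(AMINO_ACIDS):
            row[idx] = 1
        vectors.append(row)
    return vectors


if __name__ == "__main__":
    seq = "MKTX"  # hypothetical fragment; X stands for an unknown residue
    print(integer_encode(seq))
    print([sum(v) for v in one_hot_encode(seq)])
```

Integer indices of this kind are what an end-to-end model would typically pass to a trainable embedding layer, whereas the one-hot vectors are a fixed representation with no learned parameters.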
