Identification of sub-Golgi protein localization by use of deep representation learning features.

Authors

Lv Zhibin, Wang Pingping, Zou Quan, Jiang Qinghua

Affiliations

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China.

Center for Bioinformatics, School of Life Science and Technology, Harbin Institute of Technology, Harbin 150000, China.

Publication

Bioinformatics. 2021 Apr 5;36(24):5600-5609. doi: 10.1093/bioinformatics/btaa1074.

Abstract

MOTIVATION

The Golgi apparatus plays a key functional role in protein biosynthesis within the eukaryotic cell, and its malfunction results in various neurodegenerative diseases. To better understand the Golgi apparatus, it is essential to identify the sub-Golgi localization of proteins. Although some machine learning methods have been used to identify sub-Golgi localization proteins by fusing sequence representations, accurate sub-Golgi protein identification remains challenging with existing methods.

RESULTS

We developed a protocol for identifying protein sub-Golgi localization that uses deep representation learning features with 107 dimensions. With this protocol, we demonstrated that, instead of fusing multiple types of protein sequence feature representations as in previous state-of-the-art sub-Golgi protein localization classifiers, it is sufficient to exploit only one type of feature representation to identify sub-Golgi proteins more accurately. Independent testing on benchmark datasets indicates that our protocol generalizes well and performs reliably and robustly for sub-Golgi protein localization prediction.
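
As a rough illustration of the single-representation idea, the sketch below assumes that each protein has already been converted into one fixed-length vector of 107 features and that a classifier is trained on those vectors alone, with no fusion of other feature types. Everything here is a hypothetical stand-in: the synthetic data, the labels `class_a` and `class_b`, the nearest-centroid classifier and all function names are assumptions chosen for brevity, not the authors' implementation, which is available from the repository listed under Availability.

```python
import random

N_FEATURES = 107  # feature dimensionality stated in the abstract


def make_toy_vectors(center, n, rng):
    """Draw n synthetic feature vectors around a constant center (toy data only)."""
    return [[rng.gauss(center, 1.0) for _ in range(N_FEATURES)] for _ in range(n)]


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(train):
    """train maps label -> list of vectors; returns label -> class centroid."""
    return {label: centroid(vectors) for label, vectors in train.items()}


def predict(model, vector):
    """Return the label whose centroid is closest to the given vector."""
    return min(model, key=lambda label: squared_distance(model[label], vector))


# Hypothetical two-class toy problem standing in for two sub-Golgi classes.
rng = random.Random(0)
train = {
    "class_a": make_toy_vectors(-1.0, 20, rng),
    "class_b": make_toy_vectors(1.0, 20, rng),
}
model = fit(train)

held_out = make_toy_vectors(-1.0, 5, rng) + make_toy_vectors(1.0, 5, rng)
predictions = [predict(model, vector) for vector in held_out]
print(predictions)
```

The nearest-centroid rule is used only because it fits in a few lines of standard-library Python; any classifier that accepts a single fixed-length feature vector per protein could take its place.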

AVAILABILITY AND IMPLEMENTATION

A user-friendly webserver is freely accessible at http://isGP-DRLF.aibiochem.net and the prediction code is accessible at https://github.com/zhibinlv/isGP-DRLF.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
