DeepSS2GO: protein function prediction from secondary structure.

Authors

Song Fu V, Su Jiaqi, Huang Sixing, Zhang Neng, Li Kaiyue, Ni Ming, Liao Maofu

Affiliations

Department of Chemical Biology, School of Life Sciences, Southern University of Science and Technology, Xueyuan Avenue, 518055, Shenzhen, China.

Gemini Data Japan, Kitaku Oujikamiya 1-11-11, 115-0043, Tokyo, Japan.

Publication information

Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae196.

Abstract

Predicting protein function is crucial for understanding biological processes, preventing disease and developing new drug targets. In recent years, protein function annotation methods based on sequence, structure and biological networks have been studied extensively. Although obtaining a protein's three-dimensional structure through experimental or computational methods improves the accuracy of function prediction, the sheer volume of proteins sequenced by high-throughput technologies makes this difficult to do at scale. To address this issue, we introduce DeepSS2GO (Secondary Structure to Gene Ontology), a deep neural network predictor that incorporates secondary structure features along with primary sequence and homology information. The algorithm combines the speed of sequence-based information with the accuracy of structure-based features, while reducing the redundant data in primary sequences and bypassing the time-consuming step of tertiary structure analysis. The results show that its prediction performance surpasses that of state-of-the-art algorithms. By making effective use of secondary structure information, it can predict key functions rather than only broad, general Gene Ontology terms. Additionally, DeepSS2GO predicts five times faster than advanced algorithms, making it well suited to massive sequencing data. The source code and trained models are available at https://github.com/orca233/DeepSS2GO.
