EC number prediction of protein sequences based on combination of hierarchical and global features.

Affiliations

School of Technology, Beijing Forestry University, Beijing 100083, China.

Key Lab of State Forestry Administration for Forestry Equipment and Automation, Beijing 100083, China.

Publication information

Yi Chuan. 2024 Aug;46(8):661-669. doi: 10.16288/j.yczz.24-102.

Abstract

The identification of enzyme functions plays a crucial role in understanding the mechanisms of biological activities and advancing the development of life sciences. However, existing enzyme EC number prediction methods did not fully utilize protein sequence information and still had shortcomings in identification accuracy. To address this issue, we proposed an EC number prediction network using hierarchical features and global features (ECPN-HFGF). This method first utilized residual networks to extract generic features from protein sequences, and then employed hierarchical feature extraction modules and global feature extraction modules to further extract hierarchical and global features of protein sequences. Subsequently, the prediction results of both feature types were combined, and a multitask learning framework was utilized to achieve accurate prediction of enzyme EC numbers. Experimental results indicated that the ECPN-HFGF method performed best in the task of predicting EC numbers for protein sequences, achieving macro F1 and micro F1 scores of 95.5% and 99.0%, respectively. The ECPN-HFGF method effectively combined hierarchical and global features of protein sequences, allowing for rapid and accurate EC number prediction. Compared to current commonly used methods, this method offers significantly higher prediction accuracy, providing an efficient approach for the advancement of enzymology research and enzyme engineering applications.
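
The abstract reports both a macro F1 and a micro F1 score, and the two can differ noticeably when some EC classes are much rarer than others. The sketch below is a minimal illustration of how the two averages are computed, assuming single-label predictions. The function name `f1_scores` and the toy labels are hypothetical and are not taken from the paper, whose evaluation setup may differ.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    labels = set(y_true) | set(y_pred)

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 values, so every class counts equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes first, so every sample counts equally.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Toy labels (made up for illustration): class "1" is common, class "4" is rare.
y_true = ["1"] * 8 + ["4"] * 2
y_pred = ["1"] * 8 + ["1", "4"]
macro_f1, micro_f1 = f1_scores(y_true, y_pred)
print(f"macro F1 = {macro_f1:.3f}, micro F1 = {micro_f1:.3f}")
```

In this toy case the single error falls on the rare class, which pulls the macro average down more than the micro average. A gap in the same direction, as in the reported 95.5% versus 99.0%, is what one would expect if errors are concentrated in rare classes, although the abstract does not state the cause.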
