Yan Yueyang, Shi Zhanpeng, Wei Haijian
College of Veterinary Medicine, Jilin University, Changchun, China.
Department of Organ Transplantation, The Affiliated Yantai Yuhuangding Hospital of Qingdao University, Yantai City, China.
Front Microbiol. 2023 Sep 7;14:1245805. doi: 10.3389/fmicb.2023.1245805. eCollection 2023.
Reactive oxygen species (ROS) are highly reactive molecules that play important roles in microbial biological processes. However, excessive accumulation of ROS can lead to oxidative stress and cellular damage. Microorganisms have evolved a diverse suite of enzymes to mitigate the harmful effects of ROS. Accurate prediction of the classes of ROS scavenging enzymes (ROSes) is crucial for understanding the mechanisms of oxidative stress and developing strategies to combat related diseases. Nevertheless, existing approaches for categorizing ROS-related proteins have drawbacks with regard to their precision and inclusiveness. To address this, we propose a new multi-task deep learning framework called ROSes-FINDER. This framework integrates three component methods using a voting-based approach to predict multiple ROSes properties simultaneously. It can identify whether a given protein sequence is a ROSes and determine its type. The three component methods are ROSes-CNN, which extracts raw sequence encoding features; ROSes-NN, which predicts protein functions based on sequence information; and ROSes-XGBoost, which performs functional classification using ensemble machine learning. Comprehensive experiments demonstrate the superior performance and robustness of our method. ROSes-FINDER is freely available at https://github.com/alienn233/ROSes-Finder for predicting ROSes classes.
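The voting-based integration of the three component methods can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so this example assumes simple hard majority voting over one predicted label per component; the function names, the `non-ROSes` label, and the enzyme class names are hypothetical and are not taken from the ROSes-FINDER code.

```python
from collections import Counter
from typing import List, Sequence

# Hypothetical label for sequences that are not ROS scavenging enzymes.
NON_ROSES = "non-ROSes"


def majority_vote(predictions: Sequence[str], fallback: str = NON_ROSES) -> str:
    """Return the label backed by a strict majority of components.

    If no label gets more than half of the votes, return `fallback`
    (an assumed tie-breaking rule, not one stated in the abstract).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    label, count = Counter(predictions).most_common(1)[0]
    return label if count > len(predictions) / 2 else fallback


def vote_batch(cnn: Sequence[str], nn: Sequence[str], xgb: Sequence[str]) -> List[str]:
    """Combine per-sequence labels from three component models."""
    if not (len(cnn) == len(nn) == len(xgb)):
        raise ValueError("all components must predict the same number of sequences")
    return [majority_vote(votes) for votes in zip(cnn, nn, xgb)]


# Illustrative labels only; the real class list comes from the trained models.
cnn_labels = ["catalase", "peroxidase", "non-ROSes"]
nn_labels = ["catalase", "catalase", "non-ROSes"]
xgb_labels = ["peroxidase", "superoxide dismutase", "catalase"]
consensus = vote_batch(cnn_labels, nn_labels, xgb_labels)
```

With three voters, a strict majority means at least two components agree. A sequence on which all three disagree falls back to the negative class here, which is a conservative choice; a probability-weighted (soft) vote would be an equally plausible alternative.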