Accurately identifying positive and negative regulation of apoptosis using fusion features and machine learning methods.

Author information

Wu Cheng-Yan, Xu Zhi-Xue, Li Nan, Qi Dan-Yang, Hao Zhi-Hong, Wu Hong-Ye, Gao Ru, Jin Yan-Ting

Affiliations

Key Laboratory of Magnetism and Magnetic Materials at Universities of Inner Mongolia Autonomous Region, Baotou Teacher's College, Baotou 014010, China.

The People's Hospital of Wenjiang, Chengdu, Sichuan 611130, China.

Publication information

Comput Biol Chem. 2024 Dec;113:108207. doi: 10.1016/j.compbiolchem.2024.108207. Epub 2024 Sep 11.

Abstract

Apoptotic proteins play a crucial role in apoptosis, maintaining the balance between cell proliferation and cell death. Further elucidating the regulatory mechanisms of apoptosis will therefore improve our understanding of how these proteins function. However, developing computational methods that accurately distinguish positive from negative regulation of apoptosis remains a significant challenge. This work proposes a machine learning model based on multi-feature fusion to identify positive and negative regulators of apoptosis. First, we constructed a reliable benchmark dataset containing 200 proteins involved in positive regulation of apoptosis and 241 proteins involved in negative regulation of apoptosis. We then developed a classifier that combines a support vector machine (SVM) with four feature descriptors: pseudo composition of k-spaced amino acid pairs (PseCKSAAP), composition, transition and distribution (CTD), dipeptide deviation from expected mean (DDE), and PSSM-composition. Analysis of variance (ANOVA) was used to select the optimized feature subset that yielded the best prediction performance. On independent data, the proposed model achieved an accuracy of 0.781 and an AUROC of 0.837, demonstrating its predictive capability.
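The abstract names its descriptors and its feature-selection step without giving formulas, so the following is only a minimal sketch of two of them, not the authors' implementation. It assumes the commonly used definition of DDE (observed dipeptide frequency standardized by a theoretical mean and variance derived from codon counts in the standard genetic code) and a plain two-group one-way ANOVA F-score for ranking a single feature. The function names `dde_features` and `anova_f_score` are hypothetical.

```python
import math
from itertools import product

# Number of codons encoding each amino acid in the standard genetic code
# (61 sense codons in total). Used for the theoretical dipeptide mean.
CODON_COUNTS = {
    "A": 4, "C": 2, "D": 2, "E": 2, "F": 2, "G": 4, "H": 2, "I": 3,
    "K": 2, "L": 6, "M": 1, "N": 2, "P": 4, "Q": 2, "R": 6, "S": 6,
    "T": 4, "V": 4, "W": 1, "Y": 2,
}
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def dde_features(sequence):
    """Return a 400-dimensional DDE vector (assumed standard definition)."""
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    n_pairs = 0
    # Count adjacent residue pairs; pairs with non-standard residues are skipped.
    for a, b in zip(seq, seq[1:]):
        if a in CODON_COUNTS and b in CODON_COUNTS:
            counts[a + b] += 1
            n_pairs += 1
    if n_pairs == 0:
        raise ValueError("sequence has no valid dipeptides")
    features = []
    for dp in DIPEPTIDES:
        dc = counts[dp] / n_pairs                                   # observed frequency
        tm = (CODON_COUNTS[dp[0]] / 61) * (CODON_COUNTS[dp[1]] / 61)  # theoretical mean
        tv = tm * (1 - tm) / n_pairs                                # theoretical variance
        features.append((dc - tm) / math.sqrt(tv))
    return features


def anova_f_score(pos_values, neg_values):
    """One-way ANOVA F statistic of one feature across two classes."""
    n1, n2 = len(pos_values), len(neg_values)
    m1 = sum(pos_values) / n1
    m2 = sum(neg_values) / n2
    grand = (sum(pos_values) + sum(neg_values)) / (n1 + n2)
    # Between-group sum of squares (1 degree of freedom for two groups).
    between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2
    # Within-group sum of squares (n1 + n2 - 2 degrees of freedom).
    within = sum((v - m1) ** 2 for v in pos_values)
    within += sum((v - m2) ** 2 for v in neg_values)
    if within == 0:
        return math.inf if between > 0 else 0.0
    return between / (within / (n1 + n2 - 2))
```

In a pipeline of the kind the abstract describes, vectors like these would be concatenated with the other descriptors, each feature ranked by its F-score across the two classes, and the top-ranked subset passed to the SVM. The number of features retained is not stated in the abstract.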
