基于集成学习的磷酸化位点检测特征选择

Ensemble learning-based feature selection for phosphorylation site detection.

作者信息

Liu Songbo, Cui Chengmin, Chen Huipeng, Liu Tong

机构信息

School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.

Beijing Institute of Control Engineering, China Academy of Space Technology, Beijing, China.

出版信息

Front Genet. 2022 Oct 21;13:984068. doi: 10.3389/fgene.2022.984068. eCollection 2022.

DOI:10.3389/fgene.2022.984068

PMID:36338976

原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC9634105/

Abstract

SARS-COV-2 is prevalent all over the world, causing more than six million deaths and seriously affecting human health. At present, there is no specific drug against SARS-COV-2. Protein phosphorylation is an important way to understand the mechanism of SARS -COV-2 infection. It is often expensive and time-consuming to identify phosphorylation sites with specific modified residues through experiments. A method that uses machine learning to make predictions about them is proposed. As all the methods of extracting protein sequence features are knowledge-driven, these features may not be effective for detecting phosphorylation sites without a complete understanding of the mechanism of protein. Moreover, redundant features also have a great impact on the fitting degree of the model. To solve these problems, we propose a feature selection method based on ensemble learning, which firstly extracts protein sequence features based on knowledge, then quantifies the importance score of each feature based on data, and finally uses the subset of important features as the final features to predict phosphorylation sites.

摘要

严重急性呼吸综合征冠状病毒2（SARS-CoV-2）在全球广泛传播，已导致超过600万人死亡，严重影响人类健康。目前，尚无针对SARS-CoV-2的特效药物。蛋白质磷酸化是了解SARS-CoV-2感染机制的重要途径。通过实验鉴定具有特定修饰残基的磷酸化位点通常既昂贵又耗时。为此提出了一种利用机器学习对其进行预测的方法。由于所有提取蛋白质序列特征的方法都是知识驱动的，在没有完全理解蛋白质机制的情况下，这些特征可能对检测磷酸化位点无效。此外，冗余特征对模型的拟合度也有很大影响。为了解决这些问题，我们提出了一种基于集成学习的特征选择方法，该方法首先基于知识提取蛋白质序列特征，然后基于数据量化每个特征的重要性得分，最后使用重要特征的子集作为最终特征来预测磷酸化位点。

Suppr 超能文献

文献检索

文件翻译

深度研究

Suppr 超能文献

文献检索

文件翻译

深度研究

基于集成学习的磷酸化位点检测特征选择

Ensemble learning-based feature selection for phosphorylation site detection.

作者信息

机构信息

出版信息

相似文献

引用本文的文献

本文引用的文献

基于集成学习的磷酸化位点检测特征选择

Ensemble learning-based feature selection for phosphorylation site detection.

作者信息

机构信息

出版信息

相似文献

引用本文的文献

本文引用的文献