PredIL13: Stacking a variety of machine and deep learning methods with ESM-2 language model for identifying IL13-inducing peptides.

Affiliation

Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Kawazu, Iizuka, Fukuoka, Japan.

Publication information

PLoS One. 2024 Aug 22;19(8):e0309078. doi: 10.1371/journal.pone.0309078. eCollection 2024.

DOI:10.1371/journal.pone.0309078

PMID:39172871

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11340954/

Abstract

Interleukin (IL)-13 is one of the more recently identified cytokines. Because IL-13 contributes to the severity of COVID-19 and alters crucial biological processes, there is an urgent need to find novel molecules or peptides capable of inducing IL-13. Computational prediction has received attention as a complement to in vivo and in vitro experimental identification of IL-13-inducing peptides, because experimental identification is time-consuming, laborious, and expensive. A few computational tools have been presented, including IL13Pred and iIL13Pred. To increase prediction capability, we developed PredIL13, an ensemble learning method that incorporates the ESM-2 protein language model. The method stacks the probability scores output by 168 single-feature machine/deep learning models and then trains a logistic regression-based meta-classifier on the stacked probability score vectors. The key techniques were implementing ESM-2 and selecting the optimal single-feature models according to their absolute weight coefficient for logistic regression (AWCLR), an indicator of the importance of each single-feature model. In particular, sequential deletion of single-feature models based on iterative AWCLR ranking (SDIWC) produced a meta-classifier consisting of the top 16 single-feature models, named PredIL13, while taking the model's accuracy into account. PredIL13 greatly outperformed the state-of-the-art predictors and is therefore a valuable tool for accelerating the detection of IL13-inducing peptides within the human genome.
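
The stacking and selection procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes synthetic probability scores from six hypothetical base models (two informative, four random) in place of the 168 real single-feature models, a plain gradient-descent logistic regression as the meta-classifier, and a fixed target size (`keep`) as the stopping rule, since the abstract does not give the exact criterion. The scores are mean-centered before fitting so that coefficient magnitudes are comparable, which is also an assumption of this sketch. All function names are illustrative.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def center_columns(X):
    # Subtract each column's mean so the weights are not entangled with the bias.
    means = [sum(col) / len(col) for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def fit_logreg(X, y, epochs=200, lr=1.0):
    # Plain logistic regression fitted by batch gradient descent.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def accuracy(X, y, w, b):
    # Fraction of samples whose thresholded probability matches the label.
    hits = 0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        hits += int((p >= 0.5) == bool(yi))
    return hits / len(y)


def sequential_deletion(X, y, keep):
    # Refit the meta-classifier, then drop the base model with the smallest
    # absolute weight, until only `keep` base models remain (assumed stop rule).
    active = list(range(len(X[0])))
    history = []
    while True:
        Xa = [[row[j] for j in active] for row in X]
        w, b = fit_logreg(Xa, y)
        history.append((list(active), accuracy(Xa, y, w, b)))
        if len(active) <= keep:
            return active, w, b, history
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]


def informative_score(label):
    # Hypothetical base model whose probability score tracks the true label.
    mu = 0.7 if label else 0.3
    return min(1.0, max(0.0, random.gauss(mu, 0.2)))


random.seed(0)
y = [random.randint(0, 1) for _ in range(200)]
# Columns 0 and 1 are informative; columns 2 to 5 are uninformative noise.
scores = [[informative_score(t), informative_score(t)]
          + [random.random() for _ in range(4)] for t in y]
X = center_columns(scores)

selected, weights, bias, history = sequential_deletion(X, y, keep=2)
for models, acc in history:
    print(len(models), "models, training accuracy", round(acc, 3))
print("selected base models:", selected)
```

The `history` list records the training accuracy at each subset size, which is one simple way to keep accuracy in view while base models are removed. A real pipeline would evaluate on held-out or cross-validated data rather than on the training set.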
