Department of Zoology, Hansraj College, University of Delhi, Delhi, India.
Department of Biochemistry, Jamia Hamdard, Delhi, India.
BMC Bioinformatics. 2023 Apr 11;24(1):141. doi: 10.1186/s12859-023-05248-6.
Inflammatory mediators drive pathology in several diseases, including coronavirus disease 2019 (COVID-19), and generally correlate with disease severity. Interleukin-13 (IL-13) is a pleiotropic cytokine known to be associated with airway inflammation in asthma and reactive airway diseases, as well as with neoplastic and autoimmune diseases. The recent association of IL-13 with COVID-19 severity has renewed interest in this cytokine. Characterizing new molecules that can regulate IL-13 induction might therefore lead to novel therapeutics.
Here, we present an improved method for predicting IL-13-inducing peptides. The positive and negative datasets were obtained from a recent study (IL13Pred), and the Pfeature algorithm was used to compute peptide features. Whereas the state-of-the-art method used a regularization-based feature selection technique (a linear support vector classifier with the L1 penalty), we used a multivariate feature selection technique, minimum redundancy maximum relevance (mRMR), to obtain non-redundant and highly relevant features. In the proposed method, improved IL-13 prediction (iIL13Pred), mRMR feature selection is instrumental in choosing the most discriminatory features of IL-13-inducing peptides and improves performance. We investigated seven common machine learning classifiers for classifying IL-13-inducing peptides: Decision Tree, Gaussian Naïve Bayes, k-Nearest Neighbour, Logistic Regression, Support Vector Machine, Random Forest, and extreme gradient boosting. On validation data, we report AUC and MCC scores of 0.83 and 0.33, respectively, an improvement over the current method.
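The mRMR idea mentioned above can be illustrated with a small greedy loop: at each step, pick the feature whose relevance to the label is highest after subtracting its average redundancy with the features already chosen. The following is a minimal sketch, not the study's implementation. It assumes absolute Pearson correlation as a simple stand-in for the mutual information that mRMR normally uses, and the names `abs_pearson`, `mrmr_select`, and the toy feature columns are hypothetical and do not come from Pfeature or the IL13Pred datasets.

```python
from math import sqrt


def abs_pearson(x, y):
    """Absolute Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov / sqrt(var_x * var_y))


def mrmr_select(columns, labels, k):
    """Greedy mRMR (difference form): score = relevance - mean redundancy.

    columns: dict mapping feature name -> list of values (one per peptide).
    labels:  list of 0/1 class labels.
    Assumption: correlation replaces mutual information for brevity.
    """
    relevance = {name: abs_pearson(col, labels) for name, col in columns.items()}
    selected = []
    remaining = set(columns)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            abs_pearson(columns[name], columns[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (invented for illustration): f_a_dup is a rescaled copy of f_a.
labels = [1, 1, 1, 0, 0, 0]
f_a = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
columns = {
    "f_a": f_a,
    "f_a_dup": [2 * v for v in f_a],
    "f_b": [0.6, 0.2, 0.7, 0.3, 0.5, 0.1],
    "f_noise": [0.5, 0.1, 0.4, 0.4, 0.5, 0.1],
}
selected = mrmr_select(columns, labels, k=2)
print(selected)
```

In this toy case the rescaled copy is as relevant as the original feature, but it is fully redundant with it, so the second pick goes to the less relevant yet complementary `f_b`. A purely univariate ranking would have chosen both copies.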
Extensive benchmarking experiments suggest that the proposed method (iIL13Pred) provides better sensitivity, specificity, accuracy, area under the receiver operating characteristic curve (AUCROC), and Matthews correlation coefficient (MCC) than the existing state-of-the-art approach (IL13Pred) on the validation dataset and on an external dataset comprising experimentally validated IL-13-inducing peptides. Additionally, experiments were performed with an expanded set of experimentally validated training data to obtain a more robust model. A user-friendly web server (www.soodlab.com/iil13pred) was also designed to facilitate rapid screening of IL-13-inducing peptides.
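For readers less familiar with the metrics named above, the sketch below shows how sensitivity, specificity, accuracy, MCC, and AUCROC can be computed for a binary classifier. It is illustrative only: the labels and scores are invented toy values, not results from the study, and the helper names `binary_metrics` and `auc_roc` and the 0.5 decision threshold are assumptions made for this example.

```python
from math import sqrt


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and MCC from 0/1 labels.

    Assumes both classes occur in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
        # MCC is conventionally set to 0 when the denominator vanishes
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


def auc_roc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Assumes both classes occur.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (invented): 4 inducers, 6 non-inducers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1, 0.35, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical threshold

metrics = binary_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, auc)
```

MCC uses all four cells of the confusion matrix, which is why it can remain modest on imbalanced data even when AUC looks strong.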