Computational Biology Laboratory, Department of Genetic Engineering, School of Bio-Engineering, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu, Tamil Nadu, 603203, India.
Department of Chemistry, Department of Carbon Materials, Chosun University, Gwangju, South Korea.
Mol Divers. 2024 Aug;28(4):2153-2161. doi: 10.1007/s11030-024-10823-x. Epub 2024 Mar 30.
Cancer is the second leading cause of death globally, so the development of effective anticancer treatments is crucial in the field of medicine. Anticancer peptides (ACPs) have shown promising therapeutic potential in cancer treatment compared to traditional methods. However, identifying ACPs experimentally is often time-intensive and expensive. To overcome this issue, we employed, for the first time, a machine learning-based approach to develop an anticancer model using small molecules. Anticancer small molecules (ACSMs) are compounds that have been developed to target and inhibit cancer cells. In this study, we used 10,000 compounds to develop machine learning models with five algorithms: Random Forest (RF), Light Gradient Boosting Machine (LightGBM), K-Nearest Neighbors (KNN), Decision Tree (DT), and Extreme Gradient Boosting (XGB). The developed models were evaluated on the test set, and the top three models were identified (RF, LightGBM, and XGB). Furthermore, to validate the predictive performance of our models, we performed external validation using FDA-approved anticancer compounds/drugs. In this analysis, the LightGBM model correctly predicted 9 of the 10 compounds as active, whereas RF and XGB predicted 8 and 7 of the 10 as active, respectively. These results demonstrate that, compared to RF and XGB, the LightGBM model showcases robust prediction capabilities, achieving a superior accuracy of 79% with an AUC of 0.88. These findings provide promising insights into the potential of our approach for predicting anticancer small molecules, highlighting the role of machine learning in advancing cancer treatment research.
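The evaluation quantities named in the abstract (test-set accuracy, AUC, and the count of known actives recovered in external validation) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names and the toy labels and scores are hypothetical and exist only to show how each metric is computed. Only the external-validation counts (9, 8, and 7 of 10) are taken from the abstract.

```python
from typing import Dict, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random active outscores a random
    inactive compound; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = active, 0 = inactive.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
# Assumed decision threshold of 0.5 for turning scores into class labels.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_accuracy = accuracy(toy_labels, toy_preds)
toy_auc = roc_auc(toy_labels, toy_scores)

# External validation counts reported in the abstract: number of the
# 10 FDA-approved compounds that each model predicted as active.
external_hits: Dict[str, int] = {"LightGBM": 9, "RF": 8, "XGB": 7}
n_external = 10
external_hit_rate = {name: hits / n_external for name, hits in external_hits.items()}
best_external_model = max(external_hit_rate, key=external_hit_rate.get)
```

Because every compound in the external set is a known active, the hit rate there is a recall-style measure. It does not show how the models treat inactive compounds, which is why it is reported alongside the test-set accuracy and AUC.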