Development and experimental validation of a machine learning model for the prediction of new antimalarials.

Authors

Kore Mukul, Acharya Dimple, Sharma Lakshya, Vembar Shruthi Sridhar, Sundriyal Sandeep

Affiliations

Department of Pharmacy, Birla Institute of Technology and Science Pilani, Pilani Campus, Vidya Vihar, Pilani, Rajasthan, 333 031, India.

Institute of Bioinformatics and Applied Biotechnology, Electronics City Phase I, Helix Biotech Park, Bengaluru, Karnataka, 560100, India.

Publication information

BMC Chem. 2025 Jan 30;19(1):28. doi: 10.1186/s13065-025-01395-4.

Abstract

A large set of antimalarial molecules (N ~ 15k) was taken from ChEMBL to build a robust random forest (RF) model for the prediction of antiplasmodial activity. Instead of high-throughput screening (HTS) data, model development used molecules tested at multiple doses against blood stages of Plasmodium falciparum. The open-access and code-free KNIME platform was used to develop a workflow that trains the model on 80% of the data (N ~ 12k). Hyperparameter values were optimized to achieve the highest predictive accuracy with nine different molecular fingerprints (MFPs), among which the model built on Avalon MFPs (referred to as RF-1) provided the best results. On the remaining 20% test set, RF-1 displayed 91.7% accuracy, 93.5% precision, 88.4% sensitivity and an area under the receiver operating characteristic curve (AUROC) of 97.3%. The predictive performance of RF-1 was comparable to that of the malaria inhibitor prediction platform (MAIP), a recently reported consensus model based on a large proprietary dataset. However, the hits that RF-1 and MAIP retrieved from a commercial library did not overlap, suggesting that the two models are complementary. Finally, RF-1 was used to screen small molecules under clinical investigation for repurposing. Six molecules were purchased, of which two human kinase inhibitors showed single-digit micromolar antiplasmodial activity. One of the hits (compound 1) was a potent inhibitor of β-hematin formation, suggesting that inhibition of parasite hemozoin (Hz) synthesis contributes to the parasiticidal effect. The training and test sets are provided as supplementary information, allowing others to reproduce this work.
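The four test-set metrics quoted in the abstract (accuracy, precision, sensitivity and AUROC) can be illustrated with a minimal Python sketch. This is not the authors' KNIME workflow: the function names, the 0.5 decision threshold and the eight toy labels and scores are assumptions made up for illustration, and none of the numbers come from the paper.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision and sensitivity (recall) for binary labels (1 = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against empty denominators when no positives are predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (active, inactive) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, NOT from the paper): 4 actives, 4 inactives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]  # assumed predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = auroc(y_true, scores)
print(metrics)  # accuracy, precision and sensitivity are each 0.75 for this toy data
print(auc)      # 0.875
```

Accuracy, precision and sensitivity depend on the chosen decision threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why a model can report a higher AUROC than accuracy.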

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76da/11783816/7ca57e4458a4/13065_2025_1395_Fig1_HTML.jpg
