Computer Science Faculty, University of A Coruña, 15071 A Coruña, Spain.
Centro de Investigación en Tecnologías de la Información y Las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain.
Int J Mol Sci. 2021 Oct 26;22(21):11519. doi: 10.3390/ijms222111519.
The theoretical prediction of drug-decorated nanoparticles (DDNPs) has become a very important task in medical applications. In this paper, Perturbation Theory Machine Learning (PTML) models were built to predict the probability that different drug-nanoparticle pairs form DDNP complexes with anti-glioblastoma activity. PTML models take as inputs the perturbations of the molecular descriptors of drugs and nanoparticles under experimental conditions. The raw dataset was obtained by combining nanoparticle experimental data with drug assays from the ChEMBL database. Ten types of machine learning methods were tested. Only 41 features were selected for the 855,129 drug-nanoparticle complexes. The best model was obtained with the Bagging classifier, an ensemble meta-estimator based on 20 decision trees, which reached an area under the receiver operating characteristic curve (AUROC) of 0.96 and an accuracy of 87% on the test subset. This model could be useful for the virtual screening of nanoparticle-drug complexes in glioblastoma. All calculations can be reproduced with the datasets and Python scripts, which are freely available in the authors' GitHub repository.
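As a rough illustration of the model family described above, the following minimal sketch fits a Bagging ensemble of 20 decision trees and reports AUROC and accuracy on a held-out test subset. It assumes scikit-learn and uses synthetic data from `make_classification` with 41 features as a stand-in; the function name `fit_bagging_sketch`, the sample size, the split ratio, and the random seed are hypothetical choices and are not taken from the authors' scripts or dataset, so the scores it prints say nothing about the published results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split


def fit_bagging_sketch(n_samples=2000, n_features=41, seed=0):
    """Fit a 20-tree Bagging classifier on synthetic data (illustration only)."""
    # Synthetic stand-in for the PTML input table: 41 numeric features
    # and a binary label. The real features are perturbation descriptors.
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=10,
        random_state=seed,
    )

    # Hold out a test subset; the 75/25 split is an assumption.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # The default base learner of BaggingClassifier is a decision tree,
    # so n_estimators=20 gives an ensemble of 20 decision trees.
    model = BaggingClassifier(n_estimators=20, random_state=seed)
    model.fit(X_train, y_train)

    # AUROC is computed from predicted probabilities of the positive class,
    # accuracy from the hard class predictions.
    proba = model.predict_proba(X_test)[:, 1]
    auroc = roc_auc_score(y_test, proba)
    acc = accuracy_score(y_test, model.predict(X_test))
    return model, auroc, acc


model, auroc, acc = fit_bagging_sketch()
print(f"AUROC = {auroc:.3f}, accuracy = {acc:.3f}")
```

Reproducing the reported figures would require replacing the synthetic matrix with the 41 selected features from the authors' dataset and using their train/test partition.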