PTML Combinatorial Model of ChEMBL Compounds Assays for Multiple Types of Cancer.

Author information

Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940, Leioa, Spain.

IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain.

Publication information

ACS Comb Sci. 2018 Nov 12;20(11):621-632. doi: 10.1021/acscombsci.8b00090. Epub 2018 Oct 3.

DOI:10.1021/acscombsci.8b00090

PMID:30240186

Abstract

Determining the target proteins of new anticancer compounds is a very important task in medicinal chemistry. To this end, chemists carry out preclinical assays under a large number of combinations of experimental conditions (c_j). In fact, the ChEMBL database contains the outcomes of 65 534 different preclinical assays of anticancer activity for 35 565 different chemical compounds (1.84 assays per compound). These assays cover different combinations of c_j formed from >70 different biological activity parameters (c_0), >300 different drug targets (c_1), >230 cell lines (c_2), and 5 organisms of assay (c_3) or organisms of the target (c_4). The data set includes a total of 45 833 assays in leukemia, 6227 assays in breast cancer, 2499 assays in ovarian cancer, 3499 in colon cancer, 3159 in lung cancer, 2750 in prostate cancer, 601 in melanoma, etc. This is a very complex data set with multiple Big Data features. These data are hard for researchers to rationalize in order to extract useful relationships and predict new compounds. In this context, we propose to combine perturbation theory (PT) ideas and machine learning (ML) modeling to solve this combinatorial-like problem. In this work, we report a PTML (PT + ML) model for the ChEMBL data set of preclinical assays of anticancer compounds. This is a simple linear model with only three variables. In the training series, the model presented an area under the receiver operating characteristic curve (AUROC) of 0.872, specificity Sp(%) = 90.2, sensitivity Sn(%) = 70.6, and overall accuracy Ac(%) = 87.7. In the external validation series, the model has Sp(%) = 90.1, Sn(%) = 71.4, and Ac(%) = 87.8. The model uses PT operators based on multicondition moving averages to capture the complexity of the data set. We also compared the model with nonlinear artificial neural network (ANN) models, obtaining similar results. This supports the hypothesis of a linear relationship between the PT operators and the classification of compounds as anticancer under different combinations of assay conditions. Last, we compared the model with other PTML models reported in the literature, concluding that this is the only PTML model able to predict activity against multiple types of cancer. This model is a simple but versatile tool for predicting the targets of anticancer compounds while taking into consideration multiple combinations of experimental conditions in preclinical assays.
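The abstract names two ingredients without giving formulas: PT operators built from multicondition moving averages, and the Sn/Sp/Ac statistics. The sketch below illustrates one common reading of each, under stated assumptions. The moving-average operator is taken to be the deviation of a compound descriptor from its mean over all assays sharing the same value of a condition c_j; this definition, the descriptor name `logp`, the condition keys `c1` and `c2`, and all data values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def moving_average_deviations(rows, descriptor, conditions):
    """Return, per condition key, each row's descriptor minus the mean of
    that descriptor over all rows sharing the same condition value.

    Assumed form of a moving-average PT operator; the abstract does not
    state the exact formula used by the authors.
    """
    result = {}
    for cond in conditions:
        sums = defaultdict(float)
        counts = defaultdict(int)
        for row in rows:
            sums[row[cond]] += row[descriptor]
            counts[row[cond]] += 1
        result[cond] = [
            row[descriptor] - sums[row[cond]] / counts[row[cond]]
            for row in rows
        ]
    return result


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy (all in %) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "Sn": 100 * tp / (tp + fn),
        "Sp": 100 * tn / (tn + fp),
        "Ac": 100 * (tp + tn) / len(pairs),
    }


# Hypothetical toy assays: one descriptor, two condition columns.
rows = [
    {"logp": 2.0, "c1": "T1", "c2": "A"},
    {"logp": 4.0, "c1": "T1", "c2": "B"},
    {"logp": 3.0, "c1": "T2", "c2": "A"},
]
deviations = moving_average_deviations(rows, "logp", ["c1", "c2"])
metrics = classification_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

In a model of the kind described, deviation columns like these would serve as inputs to the linear classifier, with one operator per condition type; which descriptors and conditions enter the three-variable model is not specified in the abstract.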

Similar articles

PTML Combinatorial Model of ChEMBL Compounds Assays for Multiple Types of Cancer.

ACS Comb Sci. 2018 Nov 12;20(11):621-632. doi: 10.1021/acscombsci.8b00090. Epub 2018 Oct 3.

Multioutput Perturbation-Theory Machine Learning (PTML) Model of ChEMBL Data for Antiretroviral Compounds.

Mol Pharm. 2019 Oct 7;16(10):4200-4212. doi: 10.1021/acs.molpharmaceut.9b00538. Epub 2019 Aug 30.

PTML Model for Selection of Nanoparticles, Anticancer Drugs, and Vitamins in the Design of Drug-Vitamin Nanoparticle Release Systems for Cancer Cotherapy.

Mol Pharm. 2020 Jul 6;17(7):2612-2627. doi: 10.1021/acs.molpharmaceut.0c00308. Epub 2020 Jun 8.

Perturbation Theory/Machine Learning Model of ChEMBL Data for Dopamine Targets: Docking, Synthesis, and Assay of New l-Prolyl-l-leucyl-glycinamide Peptidomimetics.

ACS Chem Neurosci. 2018 Nov 21;9(11):2572-2587. doi: 10.1021/acschemneuro.8b00083. Epub 2018 Jun 25.

PTML Model of ChEMBL Compounds Assays for Vitamin Derivatives.

ACS Comb Sci. 2020 Mar 9;22(3):129-141. doi: 10.1021/acscombsci.9b00166. Epub 2020 Feb 13.

Big Data Challenges Targeting Proteins in GPCR Signaling Pathways; Combining PTML-ChEMBL Models and [35S]GTPγS Binding Assays.

ACS Chem Neurosci. 2019 Nov 20;10(11):4476-4491. doi: 10.1021/acschemneuro.9b00302. Epub 2019 Nov 4.

Designing nanoparticle release systems for drug-vitamin cancer co-therapy with multiplicative perturbation-theory machine learning (PTML) models.

Nanoscale. 2019 Nov 21;11(45):21811-21823. doi: 10.1039/c9nr05070a.

Perturbation-Theory and Machine Learning (PTML) Model for High-Throughput Screening of Parham Reactions: Experimental and Theoretical Studies.

J Chem Inf Model. 2018 Jul 23;58(7):1384-1396. doi: 10.1021/acs.jcim.8b00286. Epub 2018 Jun 27.

Perturbation-Theory Machine Learning (PTML) Multilabel Model of the ChEMBL Dataset of Preclinical Assays for Antisarcoma Compounds.

ACS Omega. 2020 Oct 15;5(42):27211-27220. doi: 10.1021/acsomega.0c03356. eCollection 2020 Oct 27.

IFPTML Mapping of Drug Graphs with Protein and Chromosome Structural Networks vs. Pre-Clinical Assay Information for Discovery of Antimalarial Compounds.

Int J Mol Sci. 2021 Dec 2;22(23):13066. doi: 10.3390/ijms222313066.

Cited by

PTML models of self assembled ligand free nanoparticle catalysts for cross coupling reactions.

Sci Rep. 2025 Aug 14;15(1):29895. doi: 10.1038/s41598-025-14080-2.

In Silico Approach for Early Antimalarial Drug Discovery: De Novo Design of Virtual Multi-Strain Antiplasmodial Inhibitors.

Microorganisms. 2025 Jul 9;13(7):1620. doi: 10.3390/microorganisms13071620.

Perturbation-Theory Machine Learning for Multi-Target Drug Discovery in Modern Anticancer Research.

Curr Issues Mol Biol. 2025 Apr 25;47(5):301. doi: 10.3390/cimb47050301.

Modeling and Interpretability Study of the Structure-Activity Relationship for Multigeneration EGFR Inhibitors.

ACS Omega. 2025 Mar 14;10(11):11176-11187. doi: 10.1021/acsomega.4c10464. eCollection 2025 Mar 25.

In Silico Approach for Antibacterial Discovery: PTML Modeling of Virtual Multi-Strain Inhibitors Against .

Pharmaceuticals (Basel). 2025 Jan 31;18(2):196. doi: 10.3390/ph18020196.

Perturbation-theory machine learning for mood disorders: virtual design of dual inhibitors of NET and SERT proteins.

BMC Chem. 2025 Jan 2;19(1):2. doi: 10.1186/s13065-024-01376-z.

Support Vector Machine-Based Prediction Models for Drug Repurposing and Designing Novel Drugs for Colorectal Cancer.

ACS Omega. 2024 Apr 9;9(16):18584-18592. doi: 10.1021/acsomega.4c01195. eCollection 2024 Apr 23.

Implementation of IFPTML Computational Models in Drug Discovery Against Flaviviridae Family.

J Chem Inf Model. 2024 Mar 25;64(6):1841-1852. doi: 10.1021/acs.jcim.3c01796. Epub 2024 Mar 11.

COIMMR: a computational framework to reveal the contribution of herbal ingredients against human cancer via immune microenvironment and metabolic reprogramming.

Brief Bioinform. 2023 Sep 22;24(6). doi: 10.1093/bib/bbad346.

Machine Learning Study of Metabolic Networks ChEMBL Data of Antibacterial Compounds.

Mol Pharm. 2022 Jul 4;19(7):2151-2163. doi: 10.1021/acs.molpharmaceut.2c00029. Epub 2022 Jun 7.
