IFPTML Mapping of Drug Graphs with Protein and Chromosome Structural Networks vs. Pre-Clinical Assay Information for Discovery of Antimalarial Compounds.

Affiliations

Grupo RNASA-IMEDIR, Department of Computer Science, University of A Coruña, 15071 A Coruña, Spain.

Research Department, Puyo Campus, Universidad Estatal Amazónica, Puyo 160150, Ecuador.

Publication information

Int J Mol Sci. 2021 Dec 2;22(23):13066. doi: 10.3390/ijms222313066.

DOI:10.3390/ijms222313066

PMID:34884870

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8657696/

Abstract

Parasite species of the genus Plasmodium cause malaria, which remains a major global health problem because of parasite resistance to available antimalarial drugs and rising treatment costs. Consequently, computational prediction of new antimalarial compounds with novel targets in the proteome of Plasmodium sp. is an important goal for the pharmaceutical industry. The success of a preclinical assay can be expected to depend on the assay conditions themselves, the chemical structure of the drug, the structure of the target protein, and factors governing the expression of that protein in the proteome, such as gene (deoxyribonucleic acid, DNA) sequence and/or chromosome structure. However, there are no reports of computational models that consider all these factors simultaneously. Difficulties for this kind of analysis include the dispersion of data across different datasets and the high heterogeneity of the data. In this work, we analyzed three databases, ChEMBL (Chemical database of the European Molecular Biology Laboratory), UniProt (Universal Protein Resource), and NCBI-GDV (National Center for Biotechnology Information-Genome Data Viewer), to achieve this goal. The ChEMBL dataset contains outcomes for 17,758 unique assays of potential antimalarial compounds, including numeric descriptors (variables) for the structure of the compounds as well as a large amount of information about assay conditions. The NCBI-GDV and UniProt datasets include the sequences of genes and proteins and their functions. In addition, we created two partitions of categorical variables c from the ChEMBL dataset. These partitions contain variables that encode information about the experimental conditions of the preclinical assays or about the nature and quality of the data. These categorical variables include information about 22 parameters of biological activity, 28 target proteins, and 9 assay organisms, among others. We also created another partition of categorical variables with biological information about the target proteins, genes, and chromosomes. These variables cover 32 genes, 10 chromosomes, gene orientation, and 31 protein functions. We used a Perturbation-Theory Machine Learning Information Fusion (IFPTML) algorithm to map all this information (from the three databases) and train a predictive model. Shannon's entropy measure Sh (numerical variables) was used to quantify the information about the structure of drugs, protein sequences, gene sequences, and chromosomes on the same information scale. Perturbation Theory Operators (PTOs) in the form of Moving Average (MA) operators were used to quantify perturbations (deviations) in the structural variables with respect to their expected values for different subsets (partitions) of categorical variables. We obtained three IFPTML models using General Discriminant Analysis (GDA), Classification Tree with Univariate Splits (CTUS), and Classification Tree with Linear Combinations (CTLC). The IFPTML-CTLC model performed best, with sensitivity Sn(%) = 83.6/85.1 and specificity Sp(%) = 89.8/89.7 for the training/validation sets, respectively. This model could become a useful tool for optimizing preclinical assays of new antimalarial compounds against different proteins in the Plasmodium proteome.
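To make the Moving Average (MA) operator idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the natural-logarithm entropy, and the toy data (including the `IC50`/`EC50` labels) are all assumptions made for illustration. It computes a Shannon entropy from a discrete probability distribution, then expresses each numerical descriptor as its deviation from the mean of all cases that share the same categorical label.

```python
from collections import defaultdict
from math import log


def shannon_entropy(probabilities):
    """Shannon entropy Sh = -sum(p * ln p) of a discrete distribution.

    Natural log is an assumption; zero-probability terms are skipped.
    """
    return -sum(p * log(p) for p in probabilities if p > 0)


def moving_average_deviations(values, labels):
    """Return value_i - mean(values sharing label_i) for every case.

    This is the simplest MA-style perturbation term: the deviation of a
    structural descriptor from its expected value within one subset of
    a categorical variable (for example, one activity parameter).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, label in zip(values, labels):
        totals[label] += value
        counts[label] += 1
    means = {label: totals[label] / counts[label] for label in totals}
    return [value - means[label] for value, label in zip(values, labels)]


# Hypothetical toy data: one entropy-like descriptor for four assays,
# grouped by an illustrative activity-parameter label.
descriptors = [1.0, 3.0, 2.0, 6.0]
activity_labels = ["IC50", "IC50", "EC50", "EC50"]

deviations = moving_average_deviations(descriptors, activity_labels)
print(deviations)  # [-1.0, 1.0, -2.0, 2.0]
```

In a sketch like this, the deviations (one set per categorical partition) would be the input features passed to a classifier such as a discriminant analysis or a classification tree.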

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8699/8657696/c83b49dc984a/ijms-22-13066-g001.jpg
