Perturbation-Theory Machine Learning (PTML) Multilabel Model of the ChEMBL Dataset of Preclinical Assays for Antisarcoma Compounds.

Authors

Cabrera-Andrade Alejandro, López-Cortés Andrés, Munteanu Cristian R, Pazos Alejandro, Pérez-Castillo Yunierkis, Tejera Eduardo, Arrasate Sonia, González-Díaz Humbert

Affiliations

Grupo de Bio-Quimioinformática, Universidad de Las Américas, de los Granados Avenue, Quito 170125, Ecuador.

Carrera de Enfermería, Facultad de Ciencias de la Salud, Universidad de Las Américas, de los Granados Avenue, Quito 170125, Ecuador.

Publication information

ACS Omega. 2020 Oct 15;5(42):27211-27220. doi: 10.1021/acsomega.0c03356. eCollection 2020 Oct 27.

Abstract

Sarcomas are a group of malignant neoplasms of connective tissue whose etiology differs from that of carcinomas. Efforts to discover new drugs with antisarcoma activity have generated large datasets of preclinical assays run under many different experimental conditions. For instance, the ChEMBL database contains outcomes of 37,919 different antisarcoma assays with 34,955 different chemical compounds. The experimental conditions reported in this dataset include 157 types of biological activity parameters, 36 drug targets, 43 cell lines, and 17 assay organisms. Considering this information, we propose combining perturbation theory (PT) principles with machine learning (ML) to develop a PTML model to predict antisarcoma compounds. PTML models use a reference function that measures the probability of a drug being active under certain conditions (protein, cell line, organism, etc.). In this paper, we used linear discriminant analysis and a neural network to train and compare PT and non-PT models. All the explored models have an accuracy of 89.19-95.25% on the training sets and 89.22-95.46% on the validation sets. PTML-based strategies have similar accuracy but generate simpler models. Therefore, they may become a versatile tool for predicting antisarcoma compounds.
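
The abstract does not give the functional form of the reference function or of the PT inputs. As a minimal sketch, assuming a formulation often used in PTML studies, the reference value for a case can be taken as the fraction of active outcomes observed under the same assay condition, and the perturbation term as the deviation of a molecular descriptor from its mean under that condition. The toy records, the condition labels, and the helper names `condition_stats` and `ptml_features` are hypothetical and are not taken from the paper or from ChEMBL.

```python
from collections import defaultdict

# Hypothetical toy records: (descriptor value, assay condition, active flag).
# The numbers are invented for illustration; they are not ChEMBL data.
records = [
    (2.0, "cell_line_A", 1),
    (4.0, "cell_line_A", 0),
    (6.0, "cell_line_A", 1),
    (1.0, "cell_line_B", 0),
    (3.0, "cell_line_B", 0),
]


def condition_stats(rows):
    """Return {condition: (mean descriptor, fraction of active cases)}."""
    sums = defaultdict(lambda: [0.0, 0, 0])  # descriptor sum, active count, n
    for value, cond, active in rows:
        acc = sums[cond]
        acc[0] += value
        acc[1] += active
        acc[2] += 1
    return {c: (a[0] / a[2], a[1] / a[2]) for c, a in sums.items()}


def ptml_features(rows):
    """Build one (reference probability, descriptor deviation) pair per row.

    Assumption: the reference function is the per-condition active fraction,
    and the perturbation term is descriptor minus the per-condition mean.
    """
    stats = condition_stats(rows)
    features = []
    for value, cond, _active in rows:
        mean_d, p_ref = stats[cond]
        features.append((p_ref, value - mean_d))
    return features


features = ptml_features(records)
for row, feat in zip(records, features):
    print(row, feat)
```

Pairs like these could then serve as inputs to a classifier such as linear discriminant analysis or a neural network, the two model families named in the abstract. In a real study the per-condition statistics would be computed on the training set only, to avoid leaking validation outcomes into the features.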

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/553a/7594149/47c8ef80dfdd/ao0c03356_0002.jpg
