A deep learning model based on the BERT pre-trained model to predict the antiproliferative activity of anti-cancer chemical compounds.

Author information

Torabi M, Haririan I, Foroumadi A, Ghanbari H, Ghasemi F

Affiliations

Biosensor Research Centre, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.

Department of Pharmaceutics, Faculty of Pharmacy, Tehran University of Medical Sciences, Tehran, Iran.

Publication information

SAR QSAR Environ Res. 2024 Nov;35(11):971-992. doi: 10.1080/1062936X.2024.2431486. Epub 2024 Nov 28.

DOI:10.1080/1062936X.2024.2431486

PMID:39605280

Abstract

Identifying new compounds with minimal side effects to enhance patients' quality of life is the ultimate goal of drug discovery. Because experimental investigations are expensive and time-consuming, and data are scarce in traditional QSAR studies, deep transfer learning models, such as the BERT model, have recently been suggested. This study evaluated the model's performance in predicting anti-proliferative activity against five cancer cell lines (HeLa, MCF7, MDA-MB231, PC3, and MDA-MB) using over 3,000 synthesized molecules from PubChem. The results indicated that the model could predict the class of designed small molecules with acceptable accuracy for most cell lines, except for PC3 and MDA-MB. The model's performance was further tested on an in-house dataset of approximately 25 small molecules per cell line, with activity classes based on IC50 values. The model accurately predicted the biological activity class for HeLa with an accuracy of [value missing] and demonstrated acceptable performance for MCF7 and MDA-MB231, with accuracy between 0.56 and 0.66. However, the results were less reliable for PC3 and HepG2. In conclusion, the fine-tuned ChemBERTa model shows potential for predicting outcomes on in-house datasets.
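The evaluation described in the abstract (activity classes derived from IC50 values, then scored by accuracy) can be illustrated with a minimal sketch. The abstract does not state how IC50 values were mapped to classes, so the 10 µM cutoff, the function names, and all data values below are hypothetical and serve only to show the shape of the calculation; this is not the authors' pipeline.

```python
from typing import Sequence

# Hypothetical cutoff: the abstract does not say how IC50 maps to a class.
ACTIVE_IC50_CUTOFF_UM = 10.0


def ic50_to_class(ic50_um: float, cutoff_um: float = ACTIVE_IC50_CUTOFF_UM) -> int:
    """Return 1 (active) if IC50 is at or below the cutoff, else 0 (inactive)."""
    return 1 if ic50_um <= cutoff_um else 0


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of compounds whose predicted class matches the measured class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up IC50 values (in µM) and made-up model outputs, for illustration only.
measured_ic50_um = [2.5, 48.0, 9.9, 120.0, 0.8]
predicted_classes = [1, 0, 0, 0, 1]

true_classes = [ic50_to_class(v) for v in measured_ic50_um]
print(accuracy(true_classes, predicted_classes))
```

With only about 25 molecules per cell line, as in the in-house dataset, a single misclassified compound shifts accuracy by roughly 0.04, which is worth keeping in mind when comparing cell lines.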

Similar articles

Exploratory drug discovery in breast cancer patients: A multimodal deep learning approach to identify novel drug candidates targeting RTK signaling.

Comput Biol Med. 2024 May;174:108433. doi: 10.1016/j.compbiomed.2024.108433. Epub 2024 Apr 16.

Mechanistic selectivity investigation and 2D-QSAR study of some new antiproliferative pyrazoles and pyrazolopyridines as potential CDK2 inhibitors.

Eur J Med Chem. 2021 Jun 5;218:113389. doi: 10.1016/j.ejmech.2021.113389. Epub 2021 Mar 18.

Design, synthesis, biological evaluation and structure-activity relationship of sophoridine derivatives bearing pyrrole or indole scaffold as potential antitumor agents.

Eur J Med Chem. 2018 Sep 5;157:665-682. doi: 10.1016/j.ejmech.2018.08.021. Epub 2018 Aug 9.

Synthesis and in vitro antiproliferative effect of novel quinoline-based potential anticancer agents.

Eur J Med Chem. 2013 May;63:826-32. doi: 10.1016/j.ejmech.2013.03.008. Epub 2013 Mar 16.

New thieno[3,2-d]pyrimidine-based derivatives: Design, synthesis and biological evaluation as antiproliferative agents, EGFR and ARO inhibitors inducing apoptosis in breast cancer cells.

Bioorg Chem. 2021 Oct;115:105208. doi: 10.1016/j.bioorg.2021.105208. Epub 2021 Jul 26.

Target Prediction Model for Natural Products Using Transfer Learning.

Int J Mol Sci. 2021 Apr 28;22(9):4632. doi: 10.3390/ijms22094632.

Traditional Machine and Deep Learning for Predicting Toxicity Endpoints.

Molecules. 2022 Dec 26;28(1):217. doi: 10.3390/molecules28010217.

Biological Activity, Apoptotic Induction and Cell Cycle Arrest of New Hydrazonoyl Halides Derivatives.

Anticancer Agents Med Chem. 2019;19(9):1141-1149. doi: 10.2174/1871520619666190306123658.

Discovery of 3,6-disubstituted pyridazines as a novel class of anticancer agents targeting cyclin-dependent kinase 2: synthesis, biological evaluation and in silico insights.

J Enzyme Inhib Med Chem. 2020 Dec;35(1):1616-1630. doi: 10.1080/14756366.2020.1806259.

Cited by

Drug repurposing to identify potential FDA-approved drugs targeting three main angiogenesis receptors through a deep learning framework.

Mol Divers. 2025 May 26. doi: 10.1007/s11030-025-11214-6.
