A deep learning model based on the BERT pre-trained model to predict the antiproliferative activity of anti-cancer chemical compounds.

Author information

Torabi M, Haririan I, Foroumadi A, Ghanbari H, Ghasemi F

Affiliations

Biosensor Research Centre, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.

Department of Pharmaceutics, Faculty of Pharmacy, Tehran University of Medical Sciences, Tehran, Iran.

Publication information

SAR QSAR Environ Res. 2024 Nov;35(11):971-992. doi: 10.1080/1062936X.2024.2431486. Epub 2024 Nov 28.

Abstract

Identifying new compounds with minimal side effects to enhance patients' quality of life is the ultimate goal of drug discovery. Because experimental investigations are expensive and time-consuming, and data are scarce in traditional QSAR studies, deep transfer learning models such as BERT have recently been proposed. This study evaluated the model's performance in predicting the anti-proliferative activity of compounds against five cancer cell lines (HeLa, MCF7, MDA-MB231, PC3, and MDA-MB) using over 3,000 synthesized molecules from PubChem. The results indicated that the model could predict the class of designed small molecules with acceptable accuracy for most cell lines, except for PC3 and MDA-MB. The model's performance was further tested on an in-house dataset of approximately 25 small molecules per cell line, based on IC50 values. The model accurately predicted the biological activity class for HeLa, and demonstrated acceptable performance for MCF7 and MDA-MB231, with accuracy between 0.56 and 0.66 for those two cell lines. However, the results were less reliable for PC3 and HepG2. In conclusion, the fine-tuned ChemBERTa model shows potential for predicting outcomes on in-house datasets.
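The evaluation described above, in which activity classes derived from IC50 values are compared against predicted classes, can be illustrated with a minimal sketch. The abstract does not state the IC50 cutoff, the number of classes, or any per-molecule data, so the 10 µM threshold, the binary active/inactive scheme, and all values below are hypothetical and serve only to show how an accuracy figure of this kind is computed.

```python
from typing import Sequence

# Hypothetical cutoff: the abstract does not say how IC50 values were binned.
ACTIVE_THRESHOLD_UM = 10.0


def ic50_to_class(ic50_um: float, threshold_um: float = ACTIVE_THRESHOLD_UM) -> int:
    """Return 1 (active) if IC50 is at or below the threshold, else 0 (inactive)."""
    return 1 if ic50_um <= threshold_um else 0


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of molecules whose predicted class matches the measured class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one molecule is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up IC50 values (in µM) and made-up model predictions, for illustration only.
measured_ic50_um = [0.8, 4.2, 12.5, 55.0, 9.9]
predicted_classes = [1, 1, 1, 0, 0]

# Convert measured IC50 values to class labels, then score the predictions.
true_classes = [ic50_to_class(v) for v in measured_ic50_um]
print(true_classes, accuracy(true_classes, predicted_classes))
```

With only about 25 molecules per cell line, each misclassified molecule shifts the accuracy by roughly 0.04, so small differences between cell lines on a test set of that size should be read with caution.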
