Drug-likeness scoring based on unsupervised learning.

Authors

Lee Kyunghoon, Jang Jinho, Seo Seonghwan, Lim Jaechang, Kim Woo Youn

Affiliations

Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea

HITS Incorporation, 124 Teheran-ro, Gangnam-gu, Seoul 06234, Republic of Korea

Publication information

Chem Sci. 2021 Dec 14;13(2):554-565. doi: 10.1039/d1sc05248a. eCollection 2022 Jan 5.

DOI: 10.1039/d1sc05248a

PMID: 35126987

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8729801/

Abstract

Drug-likeness prediction is important for the virtual screening of drug candidates. It is challenging because the drug-likeness is presumably associated with the whole set of necessary properties to pass through clinical trials, and thus no definite data for regression is available. Recently, binary classification models based on graph neural networks have been proposed but with strong dependency of their performances on the choice of the negative set for training. Here we propose a novel unsupervised learning model that requires only known drugs for training. We adopted a language model based on a recurrent neural network for unsupervised learning. It showed relatively consistent performance across different datasets, unlike such classification models. In addition, the unsupervised learning model provides drug-likeness scores that well separate distributions with increasing mean values in the order of datasets composed of molecules at a later step in a drug development process, whereas the classification model predicted a polarized distribution with two extreme values for all datasets presumably due to the overconfident prediction for unseen data. Thus, this new concept offers a pragmatic tool for drug-likeness scoring and further can be applied to other biochemical applications.
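
The idea of training a language model on known drugs only and using it as a scorer can be illustrated with a small sketch. It rests on assumptions the abstract does not state: that molecules are written as SMILES strings and that the score is the likelihood the model assigns to a string. A character-level bigram model with add-one smoothing stands in for the recurrent neural network so that the example needs only the standard library. The class `BigramSmilesLM`, its `score` method, and the four-molecule training list are hypothetical and are not the authors' model or data.

```python
import math
from collections import Counter

# Start/end markers; assumed never to occur inside a SMILES string.
BOS, EOS = "^", "$"


class BigramSmilesLM:
    """Character-level bigram language model with add-one smoothing.

    A stand-in for the RNN language model mentioned in the abstract:
    it is fitted on positive examples only (no negative set).
    """

    def __init__(self, smiles_list):
        self.pair_counts = Counter()     # counts of (previous char, next char)
        self.context_counts = Counter()  # counts of previous char
        vocab = {EOS}
        for smi in smiles_list:
            seq = BOS + smi + EOS
            vocab.update(smi)
            for prev, nxt in zip(seq, seq[1:]):
                self.pair_counts[(prev, nxt)] += 1
                self.context_counts[prev] += 1
        # One extra slot so unseen characters still get non-zero probability.
        self.vocab_size = len(vocab) + 1

    def log_prob(self, prev, nxt):
        """Smoothed log P(nxt | prev)."""
        num = self.pair_counts[(prev, nxt)] + 1
        den = self.context_counts[prev] + self.vocab_size
        return math.log(num / den)

    def score(self, smi):
        """Mean per-character log-likelihood; higher means closer to the training set."""
        seq = BOS + smi + EOS
        total = sum(self.log_prob(p, n) for p, n in zip(seq, seq[1:]))
        return total / (len(seq) - 1)


# Toy "known drugs" set (illustrative only): aspirin, ibuprofen,
# paracetamol, caffeine.
drugs = [
    "CC(=O)OC1=CC=CC=C1C(=O)O",
    "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O",
    "CC(=O)NC1=CC=C(C=C1)O",
    "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
]
model = BigramSmilesLM(drugs)

# Compare a string built from familiar patterns with one made of unseen characters.
for smi in ["C1=CC=C(C=C1)C(=O)O", "FFFFFF"]:
    print(smi, round(model.score(smi), 3))
```

Dividing by the sequence length keeps long and short molecules comparable. Because the model returns a continuous likelihood rather than a class label, its scores can vary smoothly across datasets instead of collapsing to two extremes, which is the behaviour the abstract contrasts with binary classifiers.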

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd6e/8729801/61a379dd9c0e/d1sc05248a-f1.jpg
