A dataset for machine learning-based QSAR models establishment to screen beta-lactamase inhibitors using the FARM-BIOMOL chemical library.

Author information

Pitakbut Thanet, Munkert Jennifer, Xi Wenhui, Wei Yanjie, Fuhrmann Gregor

Affiliations

Department of Biology, Pharmaceutical Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Staudtstr. 5, 91058, Erlangen, Germany.

Shenzhen Key Laboratory of Intelligent Bioinformatics and Center for High-Performance Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.

Publication information

BMC Res Notes. 2025 Mar 3;18(1):91. doi: 10.1186/s13104-025-07159-6.

DOI:10.1186/s13104-025-07159-6

PMID:40033358

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11877915/

Abstract

OBJECTIVES

Beta-lactamase is a bacterial enzyme that deactivates beta-lactam antibiotics, and it is one of the leading causes of antibiotic resistance problems globally. In current drug discovery research, molecular simulation, like molecular docking, has been routinely integrated to virtually screen an enzyme inhibitory effect. However, a commonly known limitation of molecular docking is a low percent success rate. Previously, we reported a proof-of-concept of combining machine learning with a quantitative structure-activity relationship (QSAR) model that overcame this limitation (https://doi.org/10.1186/s13065-024-01324-x). Here, we presented and navigated the dataset used in our previous report, including sixty trained models (thirty for random forest and another thirty for logistic regression).

DATA DESCRIPTION

This data note has three essential parts. The first part is an in vitro beta-lactamase inhibitory screening of eighty-nine bioactive molecules. The second part consisted of three molecular docking approaches (AutoDock Vina, DOCK6, and consensus docking). The last part is machine learning integrated with QSAR models. Therefore, this data note is vital for further model development to increase performance.
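As an illustration of the consensus docking idea mentioned above, the following minimal sketch combines two docking score tables by mean rank. It is not the authors' protocol: the molecule names and scores are hypothetical, the helper names `rank_scores` and `consensus_ranking` are invented for this example, and it assumes that a more negative score is better in both programs.

```python
from statistics import mean


def rank_scores(scores):
    """Return {molecule: rank}, where rank 1 is the most negative (best) score."""
    ordered = sorted(scores, key=lambda name: scores[name])
    return {name: position for position, name in enumerate(ordered, start=1)}


def consensus_ranking(*score_tables):
    """Order the molecules shared by all tables by their mean rank."""
    shared = set(score_tables[0])
    for table in score_tables[1:]:
        shared &= set(table)
    # Rank each program separately, using only the shared molecules.
    rank_tables = [
        rank_scores({name: table[name] for name in shared})
        for table in score_tables
    ]
    mean_rank = {name: mean(ranks[name] for ranks in rank_tables) for name in shared}
    # Ties in mean rank are broken alphabetically so the output is deterministic.
    return sorted(shared, key=lambda name: (mean_rank[name], name))


# Hypothetical scores for four molecules (not taken from the dataset).
vina_scores = {"mol_A": -9.1, "mol_B": -7.4, "mol_C": -8.2, "mol_D": -6.0}
dock6_scores = {"mol_A": -50.0, "mol_B": -52.5, "mol_C": -47.3, "mol_D": -30.2}

consensus = consensus_ranking(vina_scores, dock6_scores)
print(consensus)
```

Ranking before averaging avoids mixing the two programs' score scales directly, since the programs report scores that are not comparable in absolute terms.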

Similar articles

A dataset for machine learning-based QSAR models establishment to screen beta-lactamase inhibitors using the FARM-BIOMOL chemical library.

BMC Res Notes. 2025 Mar 3;18(1):91. doi: 10.1186/s13104-025-07159-6.

Utilizing machine learning-based QSAR model to overcome standalone consensus docking limitation in beta-lactamase inhibitors screening: a proof-of-concept study.

BMC Chem. 2024 Dec 20;18(1):249. doi: 10.1186/s13065-024-01324-x.

Differentiation of AmpC beta-lactamase binders vs. decoys using classification kNN QSAR modeling and application of the QSAR classifier to virtual screening.

J Comput Aided Mol Des. 2008 Sep;22(9):593-609. doi: 10.1007/s10822-008-9199-2. Epub 2008 Mar 13.

Machine Learning Models Identify Inhibitors of New Delhi Metallo-β-lactamase.

J Chem Inf Model. 2024 May 27;64(10):3977-3991. doi: 10.1021/acs.jcim.3c02015. Epub 2024 May 10.

Molecular basis of the beta-lactamase protein using comparative modelling, drug screening and molecular dynamics studies to understand the resistance of β-lactam antibiotics.

J Mol Model. 2020 Jul 7;26(8):200. doi: 10.1007/s00894-020-04459-5.

Docking and Molecular Dynamic of Microalgae Compounds as Potential Inhibitors of Beta-Lactamase.

Int J Mol Sci. 2022 Jan 31;23(3):1630. doi: 10.3390/ijms23031630.

Risedronate and Methotrexate Are High-Affinity Inhibitors of New Delhi Metallo-β-Lactamase-1 (NDM-1): A Drug Repurposing Approach.

Molecules. 2022 Feb 14;27(4):1283. doi: 10.3390/molecules27041283.

N-(Sulfamoylbenzoyl)-L-proline Derivatives as Potential Non-β-lactam ESBL Inhibitors: Structure-Based Lead Identification, Medicinal Chemistry and Synergistic Antibacterial Activities.

Med Chem. 2019;15(2):196-206. doi: 10.2174/1573406414666180816123232.

A computational odyssey: uncovering classical β-lactamase inhibitors in dry fruits.

J Biomol Struct Dyn. 2024 Jun;42(9):4578-4604. doi: 10.1080/07391102.2023.2220817. Epub 2023 Jun 8.

Structure-based virtual screening, molecular docking, and molecular dynamics simulation approaches for identification of new potential inhibitors of class A β-lactamase enzymes.

J Biomol Struct Dyn. 2024 Jul;42(11):5631-5641. doi: 10.1080/07391102.2023.2227724. Epub 2023 Jun 26.

References cited in this article

Utilizing machine learning-based QSAR model to overcome standalone consensus docking limitation in beta-lactamase inhibitors screening: a proof-of-concept study.

BMC Chem. 2024 Dec 20;18(1):249. doi: 10.1186/s13065-024-01324-x.

Combination of pose and rank consensus in docking-based virtual screening: the best of both worlds.

RSC Adv. 2021 Nov 2;11(56):35383-35391. doi: 10.1039/d1ra05785e. eCollection 2021 Oct 28.

Global burden of bacterial antimicrobial resistance in 2019: a systematic analysis.

Lancet. 2022 Feb 12;399(10325):629-655. doi: 10.1016/S0140-6736(21)02724-0. Epub 2022 Jan 19.

AutoDock Vina 1.2.0: New Docking Methods, Expanded Force Field, and Python Bindings.

J Chem Inf Model. 2021 Aug 23;61(8):3891-3898. doi: 10.1021/acs.jcim.1c00203. Epub 2021 Jul 19.

β-Lactamases and β-Lactamase Inhibitors in the 21st Century.

J Mol Biol. 2019 Aug 23;431(18):3472-3500. doi: 10.1016/j.jmb.2019.04.002. Epub 2019 Apr 5.

Exponential consensus ranking improves the outcome in docking and receptor ensemble docking.

Sci Rep. 2019 Mar 26;9(1):5142. doi: 10.1038/s41598-019-41594-3.

Off-Target drug effects resulting in altered gene expression events with epigenetic and "Quasi-Epigenetic" origins.

Pharmacol Res. 2016 May;107:229-233. doi: 10.1016/j.phrs.2016.03.028. Epub 2016 Mar 26.

DOCK 6: Impact of new features and current docking performance.

J Comput Chem. 2015 Jun 5;36(15):1132-56. doi: 10.1002/jcc.23905.

PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints.

J Comput Chem. 2011 May;32(7):1466-74. doi: 10.1002/jcc.21707. Epub 2010 Dec 17.
