Pitakbut Thanet, Munkert Jennifer, Xi Wenhui, Wei Yanjie, Fuhrmann Gregor
Department of Biology, Pharmaceutical Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Staudtstr. 5, 91058, Erlangen, Germany.
Shenzhen Key Laboratory of Intelligent Bioinformatics and Center for High-Performance Computing, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China.
BMC Res Notes. 2025 Mar 3;18(1):91. doi: 10.1186/s13104-025-07159-6.
Beta-lactamase is a bacterial enzyme that deactivates beta-lactam antibiotics and is one of the leading causes of antibiotic resistance worldwide. In current drug discovery research, molecular simulation, such as molecular docking, is routinely used to virtually screen for enzyme inhibitory effects. However, a well-known limitation of molecular docking is its low success rate. Previously, we reported a proof-of-concept that combined machine learning with a quantitative structure-activity relationship (QSAR) model and overcame this limitation ( https://doi.org/10.1186/s13065-024-01324-x ). Here, we present and describe the dataset used in that report, including sixty trained models (thirty random forest and thirty logistic regression).
This data note has three parts. The first is an in vitro beta-lactamase inhibitory screening of eighty-nine bioactive molecules. The second consists of three molecular docking approaches (AutoDock Vina, DOCK6, and consensus docking). The third is machine learning integrated with QSAR models. The dataset can therefore support further model development aimed at improving predictive performance.
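As an illustration of how a consensus docking result might be derived from two docking programs, the following is a minimal sketch based on average rank. The averaging scheme, the function names, and the ligand names and scores are all hypothetical and are not taken from the dataset; the note does not state how consensus docking was computed. The sketch also assumes that a lower (more negative) score indicates better predicted binding in both programs.

```python
def rank_scores(scores):
    """Return {ligand: rank}, where rank 1 is the best (lowest) score.

    Assumption: lower scores mean better predicted binding.
    Ties are not handled specially; tied ligands keep insertion order.
    """
    ordered = sorted(scores, key=lambda name: scores[name])
    return {name: position for position, name in enumerate(ordered, start=1)}


def consensus_rank(*score_tables):
    """Combine several score tables by averaging per-ligand ranks.

    Only ligands present in every table are kept. Returns a list of
    (ligand, mean_rank) tuples sorted from best to worst mean rank.
    """
    rank_tables = [rank_scores(table) for table in score_tables]
    shared = set(rank_tables[0])
    for table in rank_tables[1:]:
        shared &= set(table)
    mean_ranks = {
        name: sum(table[name] for table in rank_tables) / len(rank_tables)
        for name in shared
    }
    # Sort by mean rank, then by name so the output order is deterministic.
    return sorted(mean_ranks.items(), key=lambda item: (item[1], item[0]))


# Hypothetical scores for illustration only (not values from the dataset).
vina_scores = {"ligand_a": -8.1, "ligand_b": -6.4, "ligand_c": -7.3}
dock6_scores = {"ligand_a": -41.0, "ligand_b": -45.5, "ligand_c": -38.2}

for ligand, mean_rank in consensus_rank(vina_scores, dock6_scores):
    print(ligand, mean_rank)
```

Averaging ranks rather than raw scores avoids mixing the different score scales of the two programs; other consensus schemes (for example, score normalization or voting) are equally plausible.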