UCIBIO-REQUIMTE, Department of Chemistry and Department of Life Sciences, Faculty of Science and Technology, Universidade NOVA de Lisboa, 2829-516 Caparica, Portugal.
LAQV-REQUIMTE, Department of Chemistry, Faculty of Science and Technology, Universidade NOVA de Lisboa, 2829-516 Caparica, Portugal.
Mar Drugs. 2018 Dec 28;17(1):16. doi: 10.3390/md17010016.
The risk of methicillin-resistant Staphylococcus aureus (MRSA) infection is increasing in both developed and developing countries, and new approaches to overcome this problem are needed. A ligand-based strategy to discover new agents that inhibit MRSA infection was built by exploring machine learning techniques. The strategy is based on two quantitative structure-activity relationship (QSAR) studies, one using molecular descriptors (approach A) and the other using NMR descriptors (approach B). In approach A, regression models were developed using a total of 6645 molecules extracted from the ChEMBL, PubChem and ZINC databases and from recent literature. The performance of the regression models was evaluated by internal and external validation; the best model achieved an R² of 0.68 and an RMSE of 0.59 for the test set. In general, natural product (NP) drug discovery is a time-consuming process, and several dereplication strategies have been developed to overcome this inherent limitation. In approach B, we developed a new NP drug discovery methodology that consists of frontloading samples with 1D NMR descriptors to predict compounds with antibacterial activity prior to bioactivity screening. The NMR QSAR classification models were built using 1D NMR data (¹H and ¹³C) as descriptors, from crude extracts, fractions and pure compounds obtained from actinobacteria isolated from marine sediments collected off the Madeira Archipelago. The overall prediction accuracy of the best model exceeded 77% for both the training and test sets.
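For readers unfamiliar with the reported figures of merit, the following is a minimal sketch of how a test-set R², RMSE and classification accuracy can be computed. It assumes R² is the coefficient of determination (the abstract does not say which definition was used; some QSAR studies report the squared Pearson correlation instead). The function names and the toy values are illustrative and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted activities."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true, labels_pred):
    """Fraction of samples whose predicted class matches the observed class."""
    correct = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return correct / len(labels_true)


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [2.0, 2.0, 3.0, 3.0]
    print("R2:", r_squared(observed, predicted))
    print("RMSE:", rmse(observed, predicted))
    print("Accuracy:", accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

In this reading, approach A would be scored with `r_squared` and `rmse` on held-out molecules, and approach B with `accuracy` on held-out samples labelled active or inactive.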