Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Institute for Medical Engineering and Science and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
Nature. 2024 Feb;626(7997):177-185. doi: 10.1038/s41586-023-06887-8. Epub 2023 Dec 20.
The discovery of novel structural classes of antibiotics is urgently needed to address the ongoing antibiotic resistance crisis. Deep learning approaches have aided in exploring chemical spaces; these typically use black box models and do not provide chemical insights. Here we reasoned that the chemical substructures associated with antibiotic activity learned by neural network models can be identified and used to predict structural classes of antibiotics. We tested this hypothesis by developing an explainable, substructure-based approach for the efficient, deep learning-guided exploration of chemical spaces. We determined the antibiotic activities and human cell cytotoxicity profiles of 39,312 compounds and applied ensembles of graph neural networks to predict antibiotic activity and cytotoxicity for 12,076,365 compounds. Using explainable graph algorithms, we identified substructure-based rationales for compounds with high predicted antibiotic activity and low predicted cytotoxicity. We empirically tested 283 compounds and found that compounds exhibiting antibiotic activity against Staphylococcus aureus were enriched in putative structural classes arising from rationales. Of these structural classes of compounds, one is selective against methicillin-resistant S. aureus (MRSA) and vancomycin-resistant enterococci, evades substantial resistance, and reduces bacterial titres in mouse models of MRSA skin and systemic thigh infection. Our approach enables the deep learning-guided discovery of structural classes of antibiotics and demonstrates that machine learning models in drug discovery can be explainable, providing insights into the chemical substructures that underlie selective antibiotic activity.
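The selection step described above (averaging an ensemble's predictions, then keeping compounds with high predicted antibiotic activity and low predicted cytotoxicity) can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the thresholds, the function names, and the toy scores are all assumptions, and the per-model scores stand in for the outputs of trained graph neural networks.

```python
from statistics import mean

# Hypothetical cutoffs for illustration only; the abstract does not state
# the thresholds actually used.
ACTIVITY_MIN = 0.4
CYTOTOX_MAX = 0.2


def ensemble_score(model_scores):
    """Average the per-model predicted probabilities of an ensemble."""
    return mean(model_scores)


def select_hits(predictions, activity_min=ACTIVITY_MIN, cytotox_max=CYTOTOX_MAX):
    """Return IDs of compounds whose mean predicted activity is high and
    whose mean predicted cytotoxicity is low."""
    hits = []
    for compound_id, scores in predictions.items():
        activity = ensemble_score(scores["activity"])
        cytotox = ensemble_score(scores["cytotoxicity"])
        if activity >= activity_min and cytotox <= cytotox_max:
            hits.append(compound_id)
    return hits


# Toy per-model scores (made up); each list holds one value per ensemble member.
predictions = {
    "cmpd_A": {"activity": [0.9, 0.8, 0.7], "cytotoxicity": [0.1, 0.05, 0.1]},
    "cmpd_B": {"activity": [0.9, 0.9, 0.9], "cytotoxicity": [0.6, 0.7, 0.5]},
    "cmpd_C": {"activity": [0.1, 0.2, 0.1], "cytotoxicity": [0.0, 0.1, 0.0]},
}

hits = select_hits(predictions)
```

In this toy input only cmpd_A passes both filters: cmpd_B is predicted active but cytotoxic, and cmpd_C is predicted inactive. The substructure rationale step would then be applied to the retained compounds; it is not sketched here.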