Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, United States.
J Chem Inf Model. 2019 Sep 23;59(9):3679-3691. doi: 10.1021/acs.jcim.9b00457. Epub 2019 Aug 19.
Discovery and optimization of small molecule inhibitors as therapeutic drugs have benefited immensely from rational structure-based drug design. With recent advances in high-resolution structure determination, computational power, and machine learning methodology, it is becoming more tractable to elucidate the structural basis of drug potency. However, the applicability of machine learning models to drug design is limited by the interpretability of the resulting models in terms of feature importance. Here, we take advantage of the large number of available inhibitor-bound HIV-1 protease structures and associated potencies to evaluate inhibitor diversity and to assess machine learning models that predict ligand affinity. First, using a hierarchical clustering approach, we grouped HIV-1 protease inhibitors and identified distinct core structures. Explicit features including protein-ligand interactions were extracted from high-resolution cocrystal structures as 3D-based fingerprints. We found that a gradient boosting machine learning model with this explicit feature attribution can predict binding affinity with high accuracy. Finally, Shapley values were derived to explain local feature importance. We found that specific van der Waals (vdW) interactions of key protein residues are pivotal for the predicted potency. Protein-specific and interpretable prediction models can guide the optimization of many small molecule drugs for improved potency.
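The abstract's final step, deriving Shapley values to explain local feature importance, can be illustrated with a minimal sketch. This is not the authors' pipeline: the model below is a hypothetical toy function standing in for a trained gradient boosting model, the feature names (`vdw_a`, `vdw_b`, `hbond`) and all coefficients are invented for illustration, and the brute-force enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x, relative to a baseline sample.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only suitable for very small n.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_affinity(features):
    """Hypothetical stand-in for a trained model (assumed coefficients)."""
    vdw_a, vdw_b, hbond = features
    return 2.0 * vdw_a + 0.5 * vdw_b + 1.0 * hbond + 0.5 * vdw_a * hbond


sample = [1.0, 2.0, 1.0]    # hypothetical interaction features for one inhibitor
baseline = [0.0, 0.0, 0.0]  # hypothetical reference point (no interactions)

phi = shapley_values(toy_affinity, sample, baseline)
for name, value in zip(["vdw_a", "vdw_b", "hbond"], phi):
    print(f"{name}: {value:+.3f}")
```

The per-feature values sum to the difference between the prediction for the sample and the prediction for the baseline, which is what makes them usable as a local explanation: each value is the share of one predicted potency attributed to one feature. In the toy model, the interaction term between `vdw_a` and `hbond` is split evenly between those two features.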