Data Driven Drug Design, Center for Bioinformatics, Saarland University, Saarbrücken 66123, Germany.
Modeling and Simulation, Saarland University, Saarbrücken 66123, Germany.
J Chem Inf Model. 2024 May 27;64(10):4009-4020. doi: 10.1021/acs.jcim.4c00055. Epub 2024 May 15.
Drug discovery pipelines nowadays rely on machine learning models to explore and evaluate large chemical spaces. While including 3D structural information is considered beneficial, structural models are hindered by the availability of protein-ligand complex structures. Exemplified for kinase drug discovery, we address this issue by generating kinase-ligand complex data using template docking for the kinase compound subset of available ChEMBL assay data. To evaluate the benefit of the created complex data, we use it to train a structure-based E(3)-invariant graph neural network. Our evaluation shows that binding affinities can be predicted with significantly higher precision by models that take synthetic binding poses into account compared to ligand- or drug-target interaction models alone.
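As a rough illustration of what E(3)-invariance means for a structure-based model, the sketch below checks that pairwise interatomic distances, a common invariant input feature, do not change when a pose is rotated, reflected, or translated. It is not the network described in the abstract; the coordinates are random stand-ins for a docked pose, and the helper names `pairwise_distances` and `random_rigid_transform` are hypothetical.

```python
import numpy as np

def pairwise_distances(coords: np.ndarray) -> np.ndarray:
    """Return the (n, n) matrix of Euclidean distances between atom positions."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rigid_transform(rng: np.random.Generator):
    """Sample a random orthogonal matrix (rotation or reflection) and a translation."""
    # The Q factor of a QR decomposition is orthogonal, so it preserves distances.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    t = rng.normal(size=3)
    return q, t

rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))  # assumption: random stand-in for atom positions
q, t = random_rigid_transform(rng)
moved = coords @ q.T + t          # apply the E(3) transformation to every atom

d_before = pairwise_distances(coords)
d_after = pairwise_distances(moved)
invariant = bool(np.allclose(d_before, d_after))
```

A model that consumes only such distance-derived features produces the same prediction regardless of how the complex is oriented in space, which is the property the term E(3)-invariant refers to.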