A comprehensive comparison of deep learning-based compound-target interaction prediction models to unveil guiding design principles.

Author information

Abdollahi Sina, Schaub Darius P, Barroso Madalena, Laubach Nora C, Hutwelker Wiebke, Panzer Ulf, Gersting Søren W, Bonn Stefan

Affiliations

Institute of Medical Systems Biology, University Medical Center Hamburg-Eppendorf, Hamburg, 20251, Germany.

III. Department of Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, 20251, Germany.

Publication information

J Cheminform. 2024 Oct 28;16(1):118. doi: 10.1186/s13321-024-00913-1.

Abstract

The evaluation of compound-target interactions (CTIs) is at the heart of drug discovery efforts. Given the substantial time and monetary costs of classical experimental screening, significant efforts have been dedicated to developing deep learning-based models that can accurately predict CTIs. A comprehensive comparison of these models on a large, curated CTI dataset is, however, still lacking. Here, we perform an in-depth comparison of 12 state-of-the-art deep learning architectures that use different protein and compound representations. The models were selected for their reported performance and architectures. To reliably compare model performance, we curated over 300 thousand binding and non-binding CTIs and established several gold-standard datasets of varying size and information. Based on our findings, DeepConv-DTI consistently outperforms other models in CTI prediction performance across the majority of datasets. It achieves a Matthews correlation coefficient (MCC) of 0.6 or higher for most of the datasets and is one of the fastest models in training and inference. These results indicate that using convolution-based windows, as in DeepConv-DTI, to traverse trainable embeddings is a highly effective approach for capturing informative protein features. We also observed that physicochemical embeddings of targets increased model performance. We therefore modified DeepConv-DTI to include normalized physicochemical properties, which resulted in the overall best-performing model, Phys-DeepConv-DTI. This work highlights how the systematic evaluation of input features of compounds and targets, as well as their corresponding neural network architectures, can serve as a roadmap for the future development of improved CTI models.

Scientific contribution

This work features comprehensive CTI datasets to allow for the objective comparison and benchmarking of CTI prediction algorithms. Based on this dataset, we gained insights into which embeddings of compounds and targets and which deep learning-based algorithms perform best, providing a blueprint for the future development of CTI algorithms. Using the insights gained from this screen, we provide a novel CTI algorithm with state-of-the-art performance.
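
For readers unfamiliar with the MCC figure quoted in the abstract, the following is a minimal sketch of how the Matthews correlation coefficient can be computed for a binary binding/non-binding task. The function name `mcc`, the label encoding (1 = binding, 0 = non-binding), and the toy labels are assumptions made for illustration; they are not taken from the paper or its code, and the authors' evaluation pipeline may differ.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels.

    Assumed encoding (illustrative only): 1 = binding, 0 = non-binding.
    Returns 0.0 when the denominator is zero, a common convention.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example: 8 hypothetical compound-target pairs, 2 of them misclassified.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(mcc(y_true, y_pred))  # 0.5
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, so a value of 0.6 or higher indicates predictions that correlate well with the true labels for both the binding and the non-binding class.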

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a2f4/11520803/799e3fc93cc6/13321_2024_913_Fig1_HTML.jpg
