Interpretable drug-target affinity prediction based on pre-trained models' output as embeddings and based on structure-aware cross-attention for feature fusion.

Author information

Zheng Fang, Zhao Juanjuan, Yuan Zihang, Gao Yuanchen, Li Yafeng, Li Yaheng, Geng Yan, Qiang Yan

Affiliations

College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, 209 University Street, Yuci District, Jinzhong, 030600, China.

School of Software, Taiyuan University of Technology, 209 University Street, Yuci District, Jinzhong, 030600, China.

Publication information

Mol Divers. 2025 Apr 25. doi: 10.1007/s11030-025-11194-7.

PMID:40279085

Abstract

Protein pocket features better capture the interaction information between proteins and small molecules, thereby improving performance on drug-target interaction (DTI) prediction tasks. However, pocket data typically must be predicted with software such as AlphaFold, which entails a massive workload for datasets of tens of thousands to hundreds of thousands of samples. Moreover, feature representation networks for 3D pocket data are computationally intensive. To address this, we propose simulating 3D pocket data from sequence data by fusing the features of two different objects with structure-aware cross-attention (CASD). Precise feature representation is also a prerequisite for accurately identifying pocket information. We therefore introduce a method that uses the last-layer output of a pre-trained model as the embedding layer of a new model trained from scratch. This approach not only incorporates prior knowledge from the pre-trained model but also expands model capacity, enabling more accurate feature representation. Furthermore, we enhance the multimodal representation of small-molecule compounds by fusing features of the same object with structure-aware cross-attention (CASS), further improving feature representation. Our cross-attention mechanisms operate at the token level or node level, allowing fine-grained capture of interactions between amino acids and atoms. This makes it possible to assign each atom or amino acid a contribution score for the task, which makes our model interpretable for drug-target prediction. Experimental validation demonstrates that our model achieves state-of-the-art predictive performance.
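
To make the token-level idea concrete, the following is a minimal sketch of generic scaled dot-product cross-attention between atom embeddings (queries) and amino-acid embeddings (keys and values). It is not the authors' CASD or CASS module: it has no learned projections or structural terms, the toy embeddings are made up, and averaging the attention each residue receives is only one assumed way to obtain a per-residue contribution score. The names `cross_attention` and `contribution_scores` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: n_q x d (e.g. atom embeddings)
    keys:    n_k x d (e.g. amino-acid embeddings)
    values:  n_k x d_v
    Returns (fused, weights): fused is n_q x d_v, weights is n_q x n_k.
    """
    scale = math.sqrt(len(queries[0]))
    d_v = len(values[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this query token to every key token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of value vectors gives the fused representation.
        fused.append([sum(wj * v[c] for wj, v in zip(w, values)) for c in range(d_v)])
    return fused, weights


def contribution_scores(weights):
    """Assumed scoring rule: mean attention each key receives over all queries."""
    n_q = len(weights)
    n_k = len(weights[0])
    return [sum(weights[i][j] for i in range(n_q)) / n_q for j in range(n_k)]


# Toy, made-up embeddings: 3 atoms and 4 amino acids in a 2-dimensional space.
atom_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
residue_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, -1.0]]

fused, weights = cross_attention(atom_embeddings, residue_embeddings, residue_embeddings)
residue_scores = contribution_scores(weights)
```

Each row of `weights` is a distribution over amino acids for one atom, so the averaged `residue_scores` also sum to one and can be read as a rough ranking of residues. In a setting like the one the abstract describes, the inputs would presumably be last-layer outputs of pre-trained sequence or graph encoders rather than hand-written vectors.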
