Sequence-based prediction of protein binding regions and drug-target interactions.

Authors

Lee Ingoo, Nam Hojung

Affiliation

School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-ku, Gwangju, 61005, Republic of Korea.

Publication information

J Cheminform. 2022 Feb 8;14(1):5. doi: 10.1186/s13321-022-00584-w.

DOI:10.1186/s13321-022-00584-w

PMID:35135622

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8822694/

Abstract

Identifying drug-target interactions (DTIs) is important for drug discovery. However, searching all drug-target spaces poses a major bottleneck. Therefore, recently many deep learning models have been proposed to address this problem. However, the developers of these deep learning models have neglected interpretability in model construction, which is closely related to a model's performance. We hypothesized that training a model to predict important regions on a protein sequence would increase DTI prediction performance and provide a more interpretable model. Consequently, we constructed a deep learning model, named Highlights on Target Sequences (HoTS), which predicts binding regions (BRs) between a protein sequence and a drug ligand, as well as DTIs between them. To train the model, we collected complexes of protein-ligand interactions and protein sequences of binding sites and pretrained the model to predict BRs for a given protein sequence-ligand pair via object detection employing transformers. After pretraining the BR prediction, we trained the model to predict DTIs from a compound token designed to assign attention to BRs. We confirmed that training the BR prediction model indeed improved the DTI prediction performance. The proposed HoTS model showed good performance in BR prediction on independent test datasets even though it does not use 3D structure information in its prediction. Furthermore, the HoTS model achieved the best performance in DTI prediction on test datasets. Additional analysis confirmed the appropriate attention for BRs and the importance of transformers in BR and DTI prediction. The source code is available on GitHub (https://github.com/GIST-CSBL/HoTS).
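
The abstract frames BR prediction as object detection along the protein sequence. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a binding region can be represented as an inclusive interval of residue indices, merges nearby binding-site residues into such intervals, and scores a predicted interval against a reference one with a 1D intersection over union (IoU). The names sites_to_regions, interval_iou, and the max_gap threshold are hypothetical and chosen only for illustration; the actual region encoding and evaluation are defined in the HoTS repository.

```python
from typing import List, Tuple

# Assumption: a binding region is an inclusive (start, end) pair of
# 0-based residue indices on the protein sequence.
Region = Tuple[int, int]


def sites_to_regions(sites: List[int], max_gap: int = 2) -> List[Region]:
    """Merge binding-site residue indices into contiguous regions.

    Two sites join the same region when they are at most `max_gap`
    positions apart (a hypothetical threshold, not taken from the paper).
    """
    regions: List[Region] = []
    for pos in sorted(set(sites)):
        if regions and pos - regions[-1][1] <= max_gap:
            # Extend the current region up to this residue.
            regions[-1] = (regions[-1][0], pos)
        else:
            # Start a new single-residue region.
            regions.append((pos, pos))
    return regions


def interval_iou(a: Region, b: Region) -> float:
    """1D intersection over union of two inclusive residue intervals."""
    inter = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if inter <= 0:
        return 0.0  # the intervals do not overlap
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union


if __name__ == "__main__":
    true_regions = sites_to_regions([10, 11, 13, 40, 41])
    predicted = (12, 17)
    for region in true_regions:
        print(region, interval_iou(region, predicted))
```

An interval-overlap score like this is one common way to judge a detection-style prediction: a predicted region counts as a hit when its IoU with a reference region exceeds a chosen threshold.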

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a5bf/8822694/384764c3eaf0/13321_2022_584_Fig1_HTML.jpg

Similar articles

DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences.

PLoS Comput Biol. 2019 Jun 14;15(6):e1007129. doi: 10.1371/journal.pcbi.1007129. eCollection 2019 Jun.

AttentionDTA: Drug-Target Binding Affinity Prediction by Sequence-Based Deep Learning With Attention Mechanism.

IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):852-863. doi: 10.1109/TCBB.2022.3170365. Epub 2023 Apr 3.

GraphormerDTI: A graph transformer-based approach for drug-target interaction prediction.

Comput Biol Med. 2024 May;173:108339. doi: 10.1016/j.compbiomed.2024.108339. Epub 2024 Mar 18.

DTI-CDF: a cascade deep forest model towards the prediction of drug-target interactions based on hybrid features.

Brief Bioinform. 2021 Jan 18;22(1):451-462. doi: 10.1093/bib/bbz152.

CoaDTI: multi-modal co-attention based framework for drug-target interaction annotation.

Brief Bioinform. 2022 Nov 19;23(6). doi: 10.1093/bib/bbac446.

How to approach machine learning-based prediction of drug/compound-target interactions.

J Cheminform. 2023 Feb 6;15(1):16. doi: 10.1186/s13321-023-00689-w.

ICAN: Interpretable cross-attention network for identifying drug and target protein interactions.

PLoS One. 2022 Oct 24;17(10):e0276609. doi: 10.1371/journal.pone.0276609. eCollection 2022.

DTITR: End-to-end drug-target binding affinity prediction with transformers.

Comput Biol Med. 2022 Aug;147:105772. doi: 10.1016/j.compbiomed.2022.105772. Epub 2022 Jun 21.

Semi-supervised heterogeneous graph contrastive learning for drug-target interaction prediction.

Comput Biol Med. 2023 Sep;163:107199. doi: 10.1016/j.compbiomed.2023.107199. Epub 2023 Jun 22.

Cited by

Evidential deep learning-based drug-target interaction prediction.

Nat Commun. 2025 Jul 26;16(1):6915. doi: 10.1038/s41467-025-62235-6.

AI-guided discovery and optimization of antimicrobial peptides through species-aware language model.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf343.

Charting γ-secretase substrates by explainable AI.

Nat Commun. 2025 Jul 1;16(1):5428. doi: 10.1038/s41467-025-60638-z.

RBI: a novel algorithm for regulatory-metabolic network model in designing the optimal mutant strain.

PeerJ Comput Sci. 2025 May 27;11:e2880. doi: 10.7717/peerj-cs.2880. eCollection 2025.

GNNSeq: A Sequence-Based Graph Neural Network for Predicting Protein-Ligand Binding Affinity.

Pharmaceuticals (Basel). 2025 Feb 26;18(3):329. doi: 10.3390/ph18030329.

Deep-ProBind: binding protein prediction with transformer-based deep learning model.

BMC Bioinformatics. 2025 Mar 22;26(1):88. doi: 10.1186/s12859-025-06101-8.

Natural Language Processing Methods for the Study of Protein-Ligand Interactions.

J Chem Inf Model. 2025 Mar 10;65(5):2191-2213. doi: 10.1021/acs.jcim.4c01907. Epub 2025 Feb 24.

Machine learning approaches for predicting protein-ligand binding sites from sequence data.

Front Bioinform. 2025 Feb 3;5:1520382. doi: 10.3389/fbinf.2025.1520382. eCollection 2025.

Exploring the potential of compound-protein complex structure-free models in virtual screening using BlendNet.

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae712.

SP-DTI: subpocket-informed transformer for drug-target interaction prediction.

Bioinformatics. 2025 Mar 4;41(3). doi: 10.1093/bioinformatics/btaf011.
