Protein-ligand binding residue prediction enhancement through hybrid deep heterogeneous learning of sequence and structure data.

Affiliations

Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University.

Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China.

Publication information

Bioinformatics. 2020 May 1;36(10):3018-3027. doi: 10.1093/bioinformatics/btaa110.

DOI:10.1093/bioinformatics/btaa110
PMID:32091580
Abstract

MOTIVATION

Knowledge of protein-ligand binding residues is important for understanding the functions of proteins and their interaction mechanisms. Given an experimentally solved protein structure, accurately identifying the potential binding sites of a specific ligand on the protein remains a challenging problem. Compared with structure-alignment-based methods, machine learning algorithms offer a flexible alternative that is less dependent on annotated homogeneous protein structures. Several factors are important for an efficient protein-ligand prediction model, e.g. a discriminative feature representation and an effective learning architecture that can deal with both large-scale and severely imbalanced data.

RESULTS

In this study, we propose a novel deep-learning-based method called DELIA for protein-ligand binding residue prediction. In DELIA, a hybrid deep neural network is designed to integrate 1D sequence-based features with 2D structure-based amino acid distance matrices. To overcome the severe data imbalance between binding and nonbinding residues, mini-batch oversampling, random undersampling and stacking ensemble strategies are used to enhance the model. Experimental results on five benchmark datasets demonstrate the effectiveness of the proposed DELIA pipeline.
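The abstract names two ingredients without giving implementation details: a 2D amino acid distance matrix and oversampling within mini-batches. The following is a minimal sketch of both ideas, not DELIA's actual code. The function names, the use of one 3D coordinate per residue, and the `minority_fraction` parameter are all assumptions made for illustration.

```python
import math
import random


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates.

    Assumption: one 3D point per residue (for example its C-alpha atom).
    """
    n = len(coords)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dist[i][j] = d
            dist[j][i] = d  # the matrix is symmetric
    return dist


def oversampled_batches(labels, batch_size, minority_fraction, rng):
    """Yield index batches in which binding residues (label 1) fill a fixed share.

    Binding residues are drawn with replacement, so the same one may appear
    in several batches; nonbinding residues are each visited once per pass.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    n_pos = max(1, round(batch_size * minority_fraction))
    n_neg = batch_size - n_pos
    if n_neg < 1:
        raise ValueError("batch_size leaves no room for nonbinding residues")
    rng.shuffle(neg)
    for start in range(0, len(neg), n_neg):
        batch = neg[start:start + n_neg] + rng.choices(pos, k=n_pos)
        rng.shuffle(batch)
        yield batch
```

With `labels` holding one 0/1 entry per residue, `oversampled_batches(labels, 4, 0.5, random.Random(0))` produces batches that each contain two binding residues, however rare they are in the full set. How DELIA actually builds its batches, and which atoms define its distances, is not stated in the abstract.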

AVAILABILITY AND IMPLEMENTATION

The web server of DELIA is available at www.csbio.sjtu.edu.cn/bioinf/delia/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1
Protein-ligand binding residue prediction enhancement through hybrid deep heterogeneous learning of sequence and structure data.
Bioinformatics. 2020 May 1;36(10):3018-3027. doi: 10.1093/bioinformatics/btaa110.
2
BindWeb: A web server for ligand binding residue and pocket prediction from protein structures.
Protein Sci. 2022 Dec;31(12):e4462. doi: 10.1002/pro.4462.
3
Hum-mPLoc 3.0: prediction enhancement of human protein subcellular localization through modeling the hidden correlations of gene ontology and functional domain features.
Bioinformatics. 2017 Mar 15;33(6):843-853. doi: 10.1093/bioinformatics/btw723.
4
Designing template-free predictor for targeting protein-ligand binding sites with classifier ensemble and spatial clustering.
IEEE/ACM Trans Comput Biol Bioinform. 2013 Jul-Aug;10(4):994-1008. doi: 10.1109/TCBB.2013.104.
5
RNA-protein binding motifs mining with a new hybrid deep learning based cross-domain knowledge integration approach.
BMC Bioinformatics. 2017 Feb 28;18(1):136. doi: 10.1186/s12859-017-1561-8.
6
A new supervised over-sampling algorithm with application to protein-nucleotide binding residue prediction.
PLoS One. 2014 Sep 17;9(9):e107676. doi: 10.1371/journal.pone.0107676. eCollection 2014.
7
Constructing query-driven dynamic machine learning model with application to protein-ligand binding sites prediction.
IEEE Trans Nanobioscience. 2015 Jan;14(1):45-58. doi: 10.1109/TNB.2015.2394328.
8
R2C: improving ab initio residue contact map prediction using dynamic fusion strategy and Gaussian noise filter.
Bioinformatics. 2016 Aug 15;32(16):2435-43. doi: 10.1093/bioinformatics/btw181. Epub 2016 Apr 10.
9
HEMEsPred: Structure-Based Ligand-Specific Heme Binding Residues Prediction by Using Fast-Adaptive Ensemble Learning Scheme.
IEEE/ACM Trans Comput Biol Bioinform. 2018 Jan-Feb;15(1):147-156. doi: 10.1109/TCBB.2016.2615010. Epub 2016 Oct 4.
10
Topology Prediction Improvement of α-helical Transmembrane Proteins Through Helix-tail Modeling and Multiscale Deep Learning Fusion.
J Mol Biol. 2020 Feb 14;432(4):1279-1296. doi: 10.1016/j.jmb.2019.12.007. Epub 2019 Dec 21.

Cited by

1
Predicting nucleic acid binding sites by attention map-guided graph convolutional network with protein language embeddings and physicochemical information.
Brief Bioinform. 2025 Aug 31;26(5). doi: 10.1093/bib/bbaf457.
2
LABind: identifying protein binding ligand-aware sites via learning interactions between ligand and protein.
Nat Commun. 2025 Aug 19;16(1):7712. doi: 10.1038/s41467-025-62899-0.
3
Leveraging large language models for literature-driven prioritization of protein binding pockets.
Bioinformatics. 2025 Aug 2;41(8). doi: 10.1093/bioinformatics/btaf449.
4
StackGlyEmbed: prediction of N-linked glycosylation sites using protein language models.
Bioinform Adv. 2025 Jun 28;5(1):vbaf146. doi: 10.1093/bioadv/vbaf146. eCollection 2025.
5
RTK_RAG: Leveraging Retrieval Augmented Generation with Multi-Window Convolutional Neural Networks for Superior ATP Binding Site Prediction in Receptor Tyrosine Kinases.
J Chem Inf Model. 2025 Jul 14;65(13):7277-7284. doi: 10.1021/acs.jcim.5c00766. Epub 2025 Jun 18.
6
Improving Identification of Drug-Target Binding Sites Based on Structures of Targets Using Residual Graph Transformer Network.
Biomolecules. 2025 Feb 3;15(2):221. doi: 10.3390/biom15020221.
7
Machine learning approaches for predicting protein-ligand binding sites from sequence data.
Front Bioinform. 2025 Feb 3;5:1520382. doi: 10.3389/fbinf.2025.1520382. eCollection 2025.
8
CaBind_MCNN: Identifying Potential Calcium Channel Blocker Targets by Predicting Calcium-Binding Sites in Ion Channels and Ion Transporters Using Protein Language Models and Multiscale Feature Extraction.
J Chem Inf Model. 2025 Feb 24;65(4):2145-2157. doi: 10.1021/acs.jcim.4c02252. Epub 2025 Feb 6.
9
Predicting the location of coordinated metal ion-ligand binding sites using geometry-aware graph neural networks.
Comput Struct Biotechnol J. 2024 Dec 21;27:137-148. doi: 10.1016/j.csbj.2024.12.016. eCollection 2025.
10
A Point Cloud Graph Neural Network for Protein-Ligand Binding Site Prediction.
Int J Mol Sci. 2024 Aug 27;25(17):9280. doi: 10.3390/ijms25179280.