Predicting Protein-DNA Binding Residues by Weightedly Combining Sequence-Based Features and Boosting Multiple SVMs.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2017 Nov-Dec;14(6):1389-1398. doi: 10.1109/TCBB.2016.2616469. Epub 2016 Oct 11.

DOI:10.1109/TCBB.2016.2616469
PMID:27740495
Abstract

Protein-DNA interactions are ubiquitous in a wide variety of biological processes. Correctly locating DNA-binding residues solely from protein sequences is an important but challenging task for protein function annotation and drug discovery, especially in the post-genomic era, in which large volumes of protein sequences have quickly accumulated. In this study, we report a new predictor, named TargetDNA, for targeting protein-DNA binding residues from primary sequences. TargetDNA uses a protein's evolutionary information and its predicted solvent accessibility as two base features and employs a centered linear kernel alignment algorithm to learn the weights for combining the two features. Based on the weighted combination of features, multiple initial predictors with SVMs as classifiers are trained by applying random under-sampling to the original dataset, which copes with the severe imbalance between the numbers of DNA-binding and non-binding residues. The final ensemble predictor is obtained by boosting the multiple initially trained predictors. Experimental results demonstrate that TargetDNA achieves high prediction performance and outperforms many existing sequence-based protein-DNA binding residue predictors. The TargetDNA web server and datasets are freely available at http://csbio.njust.edu.cn/bioinf/TargetDNA/ for academic use.
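
The two ideas named in the abstract, weighting feature views by centered kernel alignment and drawing balanced training subsets by random under-sampling, can be illustrated with a minimal pure-Python sketch. It is not the TargetDNA implementation. It assumes a simple heuristic in which each view's weight is proportional to the centered alignment between its linear kernel and the label kernel; the paper may learn its weights differently. The toy data and all function names are hypothetical, and SVM training and the boosting step are left out.

```python
import random

def linear_kernel(X):
    """Gram matrix of a feature view X (list of feature vectors)."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def center(K):
    """Center a symmetric kernel matrix in feature space."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def frobenius(A, B):
    """Frobenius inner product of two equally sized matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def centered_alignment(K1, K2):
    """Centered kernel alignment; 0.0 if either centered kernel vanishes."""
    K1c, K2c = center(K1), center(K2)
    denom = (frobenius(K1c, K1c) * frobenius(K2c, K2c)) ** 0.5
    return frobenius(K1c, K2c) / denom if denom > 0 else 0.0

def alignment_weights(views, y):
    """Assumed heuristic: weight each view by its alignment with y y^T."""
    Ky = [[yi * yj for yj in y] for yi in y]
    scores = [max(centered_alignment(linear_kernel(V), Ky), 0.0) for V in views]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(views)] * len(views)
    return [s / total for s in scores]

def combine(views, weights):
    """Concatenate the views per sample, scaling each view by its weight."""
    n = len(views[0])
    return [[w * x for V, w in zip(views, weights) for x in V[i]]
            for i in range(n)]

def balanced_subsets(y, n_subsets, seed=0):
    """Random under-sampling: keep all positives, sample as many negatives."""
    rng = random.Random(seed)
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi != 1]
    k = min(len(pos), len(neg))
    return [pos + rng.sample(neg, k) for _ in range(n_subsets)]

# Toy data (hypothetical): 2 binding residues (+1), 6 non-binding (-1).
y = [1, 1, -1, -1, -1, -1, -1, -1]
view_a = [[1.0], [0.9], [-1.0], [-0.8], [-0.9], [-1.1], [-1.0], [-0.7]]  # informative
view_b = [[0.1], [-0.1], [0.1], [-0.1], [0.1], [-0.1], [0.1], [-0.1]]    # uninformative

weights = alignment_weights([view_a, view_b], y)
features = combine([view_a, view_b], weights)
subsets = balanced_subsets(y, n_subsets=3)
print(weights)
print(subsets)
```

In a full pipeline, each index list in `subsets` would select the rows of `features` used to train one SVM, and the resulting classifiers would then be combined by boosting as the abstract describes.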

Similar articles

1. Predicting Protein-DNA Binding Residues by Weightedly Combining Sequence-Based Features and Boosting Multiple SVMs.
IEEE/ACM Trans Comput Biol Bioinform. 2017 Nov-Dec;14(6):1389-1398. doi: 10.1109/TCBB.2016.2616469. Epub 2016 Oct 11.
2. TargetDBP: Accurate DNA-Binding Protein Prediction Via Sequence-Based Multi-View Feature Learning.
IEEE/ACM Trans Comput Biol Bioinform. 2020 Jul-Aug;17(4):1419-1429. doi: 10.1109/TCBB.2019.2893634. Epub 2019 Jan 18.
3. DNAPred: Accurate Identification of DNA-Binding Sites from Protein Sequence by Ensembled Hyperplane-Distance-Based Support Vector Machines.
J Chem Inf Model. 2019 Jun 24;59(6):3057-3071. doi: 10.1021/acs.jcim.8b00749. Epub 2019 Apr 16.
4. Sequence-based prediction of DNA-binding residues in proteins with conservation and correlation information.
IEEE/ACM Trans Comput Biol Bioinform. 2012 Nov-Dec;9(6):1766-75. doi: 10.1109/TCBB.2012.106.
5. Enhancing protein-vitamin binding residues prediction by multiple heterogeneous subspace SVMs ensemble.
BMC Bioinformatics. 2014 Sep 5;15(1):297. doi: 10.1186/1471-2105-15-297.
6. Designing template-free predictor for targeting protein-ligand binding sites with classifier ensemble and spatial clustering.
IEEE/ACM Trans Comput Biol Bioinform. 2013 Jul-Aug;10(4):994-1008. doi: 10.1109/TCBB.2013.104.
7. A new supervised over-sampling algorithm with application to protein-nucleotide binding residue prediction.
PLoS One. 2014 Sep 17;9(9):e107676. doi: 10.1371/journal.pone.0107676. eCollection 2014.
8. TargetFreeze: Identifying Antifreeze Proteins via a Combination of Weights using Sequence Evolutionary Information and Pseudo Amino Acid Composition.
J Membr Biol. 2015 Dec;248(6):1005-14. doi: 10.1007/s00232-015-9811-z. Epub 2015 Jun 10.
9. Boosting Granular Support Vector Machines for the Accurate Prediction of Protein-Nucleotide Binding Sites.
Comb Chem High Throughput Screen. 2019;22(7):455-469. doi: 10.2174/1386207322666190925125524.
10. TargetDBP+: Enhancing the Performance of Identifying DNA-Binding Proteins via Weighted Convolutional Features.
J Chem Inf Model. 2021 Jan 25;61(1):505-515. doi: 10.1021/acs.jcim.0c00735. Epub 2021 Jan 7.

Cited by

1. iProtDNA-SMOTE: Enhancing protein-DNA binding sites prediction through imbalanced graph neural networks.
PLoS One. 2025 May 13;20(5):e0320817. doi: 10.1371/journal.pone.0320817. eCollection 2025.
2. BRAFPred: A Novel Approach for Accurate Prediction of the B-Type Rapidly Accelerated Fibrosarcoma Inhibitor.
ACS Omega. 2025 Mar 21;10(12):12170-12184. doi: 10.1021/acsomega.4c10367. eCollection 2025 Apr 1.
3. TransBind allows precise detection of DNA-binding proteins and residues using language models and deep learning.
Commun Biol. 2025 Apr 5;8(1):568. doi: 10.1038/s42003-025-07534-w.
4. Twenty years of advances in prediction of nucleic acid-binding residues in protein sequences.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbaf016.
5. Protein-protein and protein-nucleic acid binding site prediction via interpretable hierarchical geometric deep learning.
Gigascience. 2024 Jan 2;13. doi: 10.1093/gigascience/giae080.
6. Advances in the Application of Protein Language Modeling for Nucleic Acid Protein Binding Site Prediction.
Genes (Basel). 2024 Aug 18;15(8):1090. doi: 10.3390/genes15081090.
7. EGPDI: identifying protein-DNA binding sites based on multi-view graph embedding fusion.
Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae330.
8. SOFB is a comprehensive ensemble deep learning approach for elucidating and characterizing protein-nucleic-acid-binding residues.
Commun Biol. 2024 Jun 3;7(1):679. doi: 10.1038/s42003-024-06332-0.
9. A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae162.
10. DeepAVP-TPPred: identification of antiviral peptides using transformed image-based localized descriptors and binary tree growth algorithm.
Bioinformatics. 2024 May 2;40(5). doi: 10.1093/bioinformatics/btae305.