Characterization and prediction of the binding site in DNA-binding proteins: improvement of accuracy by combining residue composition, evolutionary conservation and structural parameters.

Affiliation

Bioinformatics Centre, Bose Institute, P-1/12 CIT Scheme VIIM, Kolkata 700 054, India.

Publication information

Nucleic Acids Res. 2012 Aug;40(15):7150-61. doi: 10.1093/nar/gks405. Epub 2012 May 27.

DOI: 10.1093/nar/gks405

PMID: 22641851

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3424558/

Abstract

We present a set of four parameters that, in combination, can predict DNA-binding residues on protein structures to a high degree of accuracy. These are the number of evolutionarily conserved residues (N(cons)) and their spatial clustering (ρ(e)), hydrogen bond donor capability (D(p)) and residue propensity (R(p)). We first used these parameters to characterize 130 interfaces in a set of 126 DNA-binding proteins (DBPs). We then analyzed how well these parameters, both individually and in combination, distinguish the true binding region from the rest of the protein surface. R(p) shows the best performance, identifying the true interface with the top rank in 83% of cases. Importantly, we also used the unbound-bound test cases of the protein-DNA docking benchmark to test the efficacy of our method. When applied to the unbound form of the DBPs, R(p) can distinguish 86% of cases. Finally, we applied a support vector machine (SVM) approach to recognize the interface region, using the above parameters along with the individual amino acid composition as attributes. The accuracy of prediction is 90.5% for the bound structures and 93.6% for the unbound form of the proteins.
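
The abstract gives no formulas, so the following is only a minimal sketch of two bookkeeping steps it implies: assembling the attribute vector for one surface patch (the four parameters plus a 20-value amino acid composition, 24 attributes in total) and computing a top-rank success rate such as the reported 83%. All function names, the patch representation, and the "higher score means more interface-like" convention are assumptions made for illustration; the parameter values are taken as given inputs, and this is not the authors' implementation.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def amino_acid_composition(residues):
    """Return the fraction of each standard amino acid in a patch."""
    counts = Counter(r for r in residues if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def patch_features(n_cons, rho_e, d_p, r_p, residues):
    """Concatenate the four parameters with the 20 composition values."""
    params = [float(n_cons), float(rho_e), float(d_p), float(r_p)]
    return params + amino_acid_composition(residues)


def top_rank_success_rate(cases):
    """Fraction of proteins whose true interface patch scores highest.

    `cases` is a list of (scores, true_index) pairs, one per protein.
    Assumes a higher score means "more interface-like"; a tie for the
    top score counts as a success.
    """
    if not cases:
        return 0.0
    hits = sum(1 for scores, true_index in cases
               if scores[true_index] == max(scores))
    return hits / len(cases)


# Hypothetical values, for illustration only.
features = patch_features(12, 0.8, 0.35, 1.4, "KRKRSTGA")
print(len(features))  # 24 attributes: 4 parameters + 20 composition values

toy_cases = [([0.2, 0.9, 0.4], 1), ([0.7, 0.1, 0.3], 2)]
print(top_rank_success_rate(toy_cases))  # 0.5: true patch ranks first in 1 of 2
```

A vector of this shape could then be passed to any SVM implementation; the kernel and training protocol used in the paper are not stated in the abstract and are left open here.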

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e2a2/3424558/0013918fbd14/gks405f1.jpg

Similar articles

Residue-level prediction of DNA-binding sites and its application on DNA-binding protein predictions.

FEBS Lett. 2007 Mar 6;581(5):1058-66. doi: 10.1016/j.febslet.2007.01.086. Epub 2007 Feb 7.

Protein-DNA interactions: structural, thermodynamic and clustering patterns of conserved residues in DNA-binding proteins.

Nucleic Acids Res. 2008 Oct;36(18):5922-32. doi: 10.1093/nar/gkn573. Epub 2008 Sep 18.

Structural changes in DNA-binding proteins on complexation.

Nucleic Acids Res. 2018 Apr 20;46(7):3298-3308. doi: 10.1093/nar/gky170.

Analyses on clustering of the conserved residues at protein-RNA interfaces and its application in binding site identification.

BMC Bioinformatics. 2020 Feb 17;21(1):57. doi: 10.1186/s12859-020-3398-9.

Identifying RNA-binding residues based on evolutionary conserved structural and energetic features.

Nucleic Acids Res. 2014 Feb;42(3):e15. doi: 10.1093/nar/gkt1299. Epub 2013 Dec 16.

DISPLAR: an accurate method for predicting DNA-binding sites on protein surfaces.

Nucleic Acids Res. 2007;35(5):1465-77. doi: 10.1093/nar/gkm008. Epub 2007 Feb 6.

Protein-RNA interface residue prediction using machine learning: an assessment of the state of the art.

BMC Bioinformatics. 2012 May 10;13:89. doi: 10.1186/1471-2105-13-89.

DP-BINDER: machine learning model for prediction of DNA-binding proteins by fusing evolutionary and physicochemical information.

J Comput Aided Mol Des. 2019 Jul;33(7):645-658. doi: 10.1007/s10822-019-00207-x. Epub 2019 May 23.

Prediction of protein-protein binding site by using core interface residue and support vector machine.

BMC Bioinformatics. 2008 Dec 22;9:553. doi: 10.1186/1471-2105-9-553.

Cited by

A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond.

Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae162.

EPDRNA: A Model for Identifying DNA-RNA Binding Sites in Disease-Related Proteins.

Protein J. 2024 Jun;43(3):513-521. doi: 10.1007/s10930-024-10183-3. Epub 2024 Mar 16.

ProDFace: A web-tool for the dissection of protein-DNA interfaces.

Front Mol Biosci. 2022 Sep 6;9:978310. doi: 10.3389/fmolb.2022.978310. eCollection 2022.

DeepDISE: DNA Binding Site Prediction Using a Deep Learning Method.

Int J Mol Sci. 2021 May 24;22(11):5510. doi: 10.3390/ijms22115510.

Multiple protein-DNA interfaces unravelled by evolutionary information, physico-chemical and geometrical properties.

PLoS Comput Biol. 2020 Feb 3;16(2):e1007624. doi: 10.1371/journal.pcbi.1007624. eCollection 2020 Feb.

Individually double minimum-distance definition of protein-RNA binding residues and application to structure-based prediction.

J Comput Aided Mol Des. 2018 Dec;32(12):1363-1373. doi: 10.1007/s10822-018-0177-z. Epub 2018 Nov 26.

Structural changes in DNA-binding proteins on complexation.

Nucleic Acids Res. 2018 Apr 20;46(7):3298-3308. doi: 10.1093/nar/gky170.

Analysis and prediction of single-stranded and double-stranded DNA binding proteins based on protein sequences.

BMC Bioinformatics. 2017 Jun 12;18(1):300. doi: 10.1186/s12859-017-1715-8.

Systematic Analyses and Prediction of Human Drug Side Effect Associated Proteins from the Perspective of Protein Evolution.

Genome Biol Evol. 2017 Feb 1;9(2):337-350. doi: 10.1093/gbe/evw301.

A Large-Scale Assessment of Nucleic Acids Binding Site Prediction Programs.

PLoS Comput Biol. 2015 Dec 17;11(12):e1004639. doi: 10.1371/journal.pcbi.1004639. eCollection 2015 Dec.
