Identification of DNA-binding proteins using structural, electrostatic and evolutionary features.

Authors

Nimrod Guy, Szilágyi András, Leslie Christina, Ben-Tal Nir

Affiliation

Department of Biochemistry, The George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Israel.

Publication information

J Mol Biol. 2009 Apr 10;387(4):1040-53. doi: 10.1016/j.jmb.2009.02.023. Epub 2009 Feb 20.

DOI:10.1016/j.jmb.2009.02.023

PMID:19233205

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2726711/

Abstract

DNA-binding proteins (DBPs) participate in various crucial processes in the life-cycle of the cells, and the identification and characterization of these proteins is of great importance. We present here a random forests classifier for identifying DBPs among proteins with known 3D structures. First, clusters of evolutionarily conserved regions (patches) on the surface of proteins were detected using the PatchFinder algorithm; earlier studies showed that these regions are typically the functionally important regions of proteins. Next, we trained a classifier using features like the electrostatic potential, cluster-based amino acid conservation patterns and the secondary structure content of the patches, as well as features of the whole protein, including its dipole moment. Using 10-fold cross-validation on a dataset of 138 DBPs and 110 proteins that do not bind DNA, the classifier achieved a sensitivity and a specificity of 0.90, which is overall better than the performance of published methods. Furthermore, when we tested five different methods on 11 new DBPs that did not appear in the original dataset, only our method annotated all correctly. The resulting classifier was applied to a collection of 757 proteins of known structure and unknown function. Of these proteins, 218 were predicted to bind DNA, and we anticipate that some of them interact with DNA using new structural motifs. The use of complementary computational tools supports the notion that at least some of them do bind DNA.
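The evaluation protocol described in the abstract (10-fold cross-validation scored by sensitivity and specificity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: only the class sizes (138 DBPs, 110 non-binders) come from the abstract, while the function names, the stratified fold assignment, and the `fit_predict` callable standing in for the random forests classifier are assumptions made for the example.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def sensitivity_specificity(y_true, y_pred):
    """Return (TP / (TP + FN), TN / (TN + FP)) for binary labels 1 and 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def cross_validate(features, labels, fit_predict, k=10, seed=0):
    """Pool held-out predictions over k folds and score them once.

    fit_predict(train_X, train_y, test_X) is a hypothetical callable that
    trains any classifier and returns predicted labels for test_X.
    """
    y_pred = [None] * len(labels)
    for fold in stratified_folds(labels, k, seed):
        held_out = set(fold)
        train = [i for i in range(len(labels)) if i not in held_out]
        preds = fit_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in fold],
        )
        for i, p in zip(fold, preds):
            y_pred[i] = p
    return sensitivity_specificity(labels, y_pred)


# Class sizes from the abstract: 138 DBPs (label 1), 110 non-binders (label 0).
labels = [1] * 138 + [0] * 110
folds = stratified_folds(labels)
```

In a real run, `fit_predict` would wrap a random forest trained on the patch-level and protein-level features named in the abstract. Pooling the held-out predictions before scoring gives one sensitivity and one specificity for the whole dataset rather than an average of per-fold values.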
