Collation and analyses of DNA-binding protein domain families from sequence and structural databanks.

Author information

Malhotra Sony, Sowdhamini Ramanathan

Affiliation

National Centre for Biological Sciences, Bellary Road, GKVK Campus, Bangalore, India.

Publication information

Mol Biosyst. 2015 Apr;11(4):1110-8. doi: 10.1039/c4mb00629a.

Abstract

DNA-protein interactions govern several high-fidelity cellular processes such as DNA replication, transcription and DNA repair. Proteins that can recognise and bind DNA sequences can be classified either according to their DNA-binding motif or according to the sequence of the target nucleotides. We have collated the DNA-binding families by integrating information from both protein sequence family and structural databases. This resulted in a dataset of 1057 DNA-binding protein domain families. Their family properties (the number of members, percent identity distribution and length of members) and domain architectures were examined. Further, sequence domain families were mapped to structures in the Protein Data Bank (PDB) and the Structural Classification of Proteins (SCOP) database. DNA-binding families with no structural information were clustered into potential superfamilies based on sequence associations. On the basis of the functions attributed to DNA-binding protein folds, we observe that a majority of the DNA-binding proteins follow divergent evolution. This study can serve as a basis for annotating DNA-binding proteins, and examining their distribution, in genome(s) of interest. The entire collated set of DNA-binding protein domains is available for download as hidden Markov models.
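The family properties named in the abstract (number of members, percent identity distribution and member length) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `percent_identity` and `family_properties` are hypothetical, the toy sequences are invented, and identity is assumed here to be computed over alignment columns where both sequences have a residue, which is only one of several common definitions.

```python
from itertools import combinations
from statistics import mean


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def family_properties(alignment: dict[str, str]) -> dict:
    """Summarise one family: member count, ungapped lengths, pairwise identities."""
    sequences = list(alignment.values())
    lengths = [len(seq.replace("-", "")) for seq in sequences]
    identities = [percent_identity(a, b) for a, b in combinations(sequences, 2)]
    return {
        "members": len(sequences),
        "lengths": lengths,
        "mean_length": mean(lengths),
        "identities": identities,
        "mean_identity": mean(identities) if identities else None,
    }


# Invented toy alignment (hypothetical data, for illustration only).
toy_family = {
    "seq1": "MKR-LSA",
    "seq2": "MKRALSA",
    "seq3": "MQR-LTA",
}

props = family_properties(toy_family)
print(props["members"], props["lengths"], props["mean_identity"])
```

In practice the alignment would be read from a family database export rather than typed in, and the list in `identities` would be binned to obtain the percent identity distribution for each family.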
