Malhotra Sony, Sowdhamini Ramanathan
National Centre for Biological Sciences (TIFR), GKVK Campus, Bellary Road, Bangalore 560 065, India.
BMC Genomics. 2014 Dec 22;15(1):1159. doi: 10.1186/1471-2164-15-1159.
Gene expression is tightly regulated at both the transcriptional and post-transcriptional levels. RNA-binding proteins mediate post-transcriptional gene regulation and take part in a variety of processes, such as splicing, alternative splicing, nuclear import and export of mRNA, RNA stability and translation. Several well-characterized RNA-binding motifs are known, such as the RNA recognition motif (RRM), the KH domain and zinc fingers. In the present study, we searched the human genome for RRM-containing gene products, starting from the RRM domains in the Pfam (protein families database) repository.
Pfam records seven families of RRM-containing proteins. We studied these families for their taxonomic representation, sequence features (identity, length, phylogeny) and structural properties (mapping of conservation onto the structures). We then searched the Homo sapiens genome and identified 928 RRM-containing gene products. These were studied for their predicted domain architectures, biological processes, involvement in pathways, disease relevance and disorder content. RRM domains were observed to occur multiple times in a single polypeptide. In addition, 56 other domains co-occur with RRMs and are involved in different regulatory functions. Further, functional enrichment analysis revealed that RRM-containing gene products are mainly involved in biological functions such as mRNA splicing and its regulation.
Our sequence analysis identified RRM-containing gene products in the human genome and provides insights into their domain architectures and biological functions. Since mRNA splicing and gene regulation are important in the cellular machinery, this analysis provides an early overview of genes that carry out these functions.
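The domain-architecture step summarized above (repeated RRM domains within one polypeptide, and other domains that co-occur with RRMs) can be illustrated with a minimal sketch. This is not the authors' pipeline: the hit list, the protein identifiers and the helper names `domain_architectures` and `summarize` are hypothetical, and the sketch assumes that domain hits have already been obtained from some domain-assignment tool as (protein, domain, start) records.

```python
from collections import Counter

# Hypothetical toy input: (protein id, Pfam-style domain name, start position).
# These records are made up for illustration, not taken from the study.
hits = [
    ("protA", "RRM_1", 10),
    ("protA", "RRM_1", 110),
    ("protA", "zf-CCHC", 220),
    ("protB", "RRM_1", 5),
    ("protC", "KH_1", 40),
    ("protC", "RRM_1", 150),
]


def domain_architectures(hits):
    """Map each protein to its domains ordered from N- to C-terminus."""
    by_protein = {}
    for protein, domain, start in hits:
        by_protein.setdefault(protein, []).append((start, domain))
    # Sorting the (start, domain) pairs orders domains along the sequence.
    return {p: tuple(d for _, d in sorted(doms)) for p, doms in by_protein.items()}


def summarize(archs, rrm_prefix="RRM"):
    """Return proteins with more than one RRM, and counts of co-occurring domains."""
    multi_rrm = sorted(
        p for p, arch in archs.items()
        if sum(d.startswith(rrm_prefix) for d in arch) > 1
    )
    partners = Counter()
    for arch in archs.values():
        if any(d.startswith(rrm_prefix) for d in arch):
            # Count each non-RRM domain once per protein.
            partners.update({d for d in arch if not d.startswith(rrm_prefix)})
    return multi_rrm, partners


archs = domain_architectures(hits)
multi_rrm, partners = summarize(archs)
print(archs)
print(multi_rrm)
print(partners)
```

Counting each partner domain once per protein keeps the tally from being inflated by repeated domains; a different counting rule would be equally defensible depending on the question asked.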