Zhang Baohong, Stellwag Edmund J, Pan Xiaoping
Department of Biology, East Carolina University, Greenville, NC 27858, USA.
Gene. 2009 Aug 15;443(1-2):100-9. doi: 10.1016/j.gene.2009.04.027. Epub 2009 May 5.
Although great progress has been made in identifying microRNAs (miRNAs) and their functions, their essential functional features remain largely unknown. In this study, we systematically investigated the nucleotide and thermodynamic folding distribution characteristics of 3853 miRNAs currently reported for metazoans. We determined that uracil is the dominant nucleotide in both mature and precursor sequences, and that it is particularly enriched at three sites in mature miRNAs: the first position, the ninth position, and the five 3'-terminal nucleotides. The location of these enriched uracil nucleotides is particularly interesting because positions one and nine are the edges of the "seed" region, which is responsible for targeting mRNAs for gene regulation. The prevalence of U residues at these sites may contribute to the mechanism whereby miRNAs target and bind to their corresponding mRNAs. A comparison of the overall lengths of metazoan pre-miRNAs revealed that they ranged from 53 to 215 nt, with an average of 88.10 ± 14.14 nt, significantly longer than previously reported. Comparisons of miRNA diversity at different taxonomic levels revealed that the 12 features investigated in this study varied significantly among miRNAs from different phyla, with particularly high levels of divergence in platyhelminths relative to nematodes, arthropods, or vertebrates. By comparison, less diversity was observed at lower taxonomic levels, such that there was a direct relationship between divergence in miRNA features and taxonomic level. We conclude from this large-scale analysis that miRNAs have many more unique features than previously reported. In particular, the distribution of nucleotides suggests an important role for uracil at the boundaries of the "seed" region and at the miRNA termini. These results will facilitate the design of new computational programs for identifying novel miRNAs and investigating the mechanism of miRNA-mediated gene regulation.
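The position-wise uracil counts and the pre-miRNA length summary described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`uracil_fraction_from_5p`, `uracil_fraction_from_3p`, `length_summary`) and the toy sequences are hypothetical, and treating DNA-style `T` as `U` is an assumption of this example.

```python
from statistics import mean, stdev


def _normalize(seq):
    # Assumption of this sketch: accept DNA-style input and treat T as U.
    return seq.upper().replace("T", "U")


def uracil_fraction_from_5p(seqs, n_positions):
    """Fraction of sequences carrying U at each of the first n_positions,
    counted from the 5' end (list index 0 is position 1)."""
    seqs = [_normalize(s) for s in seqs]
    fractions = []
    for i in range(n_positions):
        # Only sequences long enough to have this position contribute.
        bases = [s[i] for s in seqs if len(s) > i]
        fractions.append(bases.count("U") / len(bases) if bases else 0.0)
    return fractions


def uracil_fraction_from_3p(seqs, n_positions):
    """Same as above, but counted inward from the 3' terminal nucleotide,
    so that mature miRNAs of different lengths are aligned at their 3' ends."""
    return uracil_fraction_from_5p([s[::-1] for s in seqs], n_positions)


def length_summary(seqs):
    """Mean and sample standard deviation of sequence lengths (needs >= 2 sequences)."""
    lengths = [len(s) for s in seqs]
    return mean(lengths), stdev(lengths)


if __name__ == "__main__":
    # Made-up toy sequences, not real miRNAs.
    toy_mature = ["UACGU", "UGGCA", "ACGUU"]
    print(uracil_fraction_from_5p(toy_mature, 3))
    print(uracil_fraction_from_3p(toy_mature, 2))
    print(length_summary(["AAAA", "AAAAAA"]))
```

Counting from the 3' end separately matters because mature miRNAs differ in length, so a fixed 5'-based index would not line up the terminal nucleotides across sequences.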