Ohler Uwe, Yekta Soraya, Lim Lee P, Bartel David P, Burge Christopher B
Department of Biology, Massachusetts Institute of Technology, Cambridge 02142, USA.
RNA. 2004 Sep;10(9):1309-22. doi: 10.1261/rna.5206304.
MicroRNAs are approximately 22-nucleotide (nt) RNAs processed from foldback segments of endogenous transcripts. Some are known to play important gene regulatory roles during animal and plant development by pairing to the messages of protein-coding genes to direct the post-transcriptional repression of these messages. Previously, we developed a computational method called MiRscan, which scores features related to the foldbacks, and used this algorithm to identify new miRNA genes in the nematode Caenorhabditis elegans. In the present study, to identify sequences that might be involved in processing or transcriptional regulation of miRNAs, we aligned sequences upstream and downstream of orthologous nematode miRNA foldbacks. These alignments showed a pronounced peak in sequence conservation about 200 bp upstream of the miRNA foldback and revealed a highly significant sequence motif, with consensus CTCCGCCC, that is present upstream of almost all independently transcribed nematode miRNA genes. Scoring the pattern of upstream/downstream conservation, the occurrence of this sequence motif, and orthology of host genes for intronic miRNA candidates, yielded substantial improvements in the accuracy of MiRscan. Nine new C. elegans miRNA gene candidates were validated using a PCR-sequencing protocol. As previously seen for bacterial RNA genes, sequence features outside of the RNA secondary structure can therefore be very useful for the computational identification of eukaryotic noncoding RNA genes. The total number of confidently identified nematode miRNAs now approaches 100. The improved analysis supports our previous assertion that miRNA gene identification is nearing completion in C. elegans with apparently no more than 20 miRNA genes now remaining to be identified.
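As a rough illustration of the motif feature described above, the following sketch scans an upstream region for exact matches to the reported consensus CTCCGCCC. The abstract does not specify how motif occurrence was scored, so exact string matching, the search of both strands, and the names `find_motif` and `reverse_complement` are all assumptions made for this example and are not the MiRscan implementation.

```python
# Illustrative sketch only: exact-match scan for the consensus motif
# reported in the abstract. How MiRscan actually scores the motif is not
# described here, so this is an assumed simplification.
MOTIF = "CTCCGCCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(upstream: str, motif: str = MOTIF):
    """Return sorted (offset, strand) pairs for exact motif matches.

    Offsets are 0-based positions on the forward strand. Searching the
    reverse strand is an assumption, not something the abstract states.
    """
    upstream = upstream.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = upstream.find(pattern)
        while start != -1:
            hits.append((start, strand))
            # Step by one so overlapping matches are also reported.
            start = upstream.find(pattern, start + 1)
    return sorted(hits)


# Made-up toy sequence, not real C. elegans data.
toy_upstream = "AAAACTCCGCCCTTTTGGGCGGAGAA"
print(find_motif(toy_upstream))
```

In practice a conserved motif of this kind would more plausibly be scored with a position weight matrix that tolerates mismatches; the exact-match version is kept here only because it is the simplest form that shows the idea.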