Washietl Stefan, Hofacker Ivo L, Lukasser Melanie, Hüttenhofer Alexander, Stadler Peter F
Institute for Theoretical Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria.
Nat Biotechnol. 2005 Nov;23(11):1383-90. doi: 10.1038/nbt1144.
In contrast to the fairly reliable and complete annotation of the protein-coding genes in the human genome, comparable information is lacking for noncoding RNAs (ncRNAs). We present a comparative screen of vertebrate genomes for structural noncoding RNAs, which evaluates conserved genomic DNA sequences for signatures of structural conservation of base-pairing patterns and exceptional thermodynamic stability. We predict more than 30,000 structured RNA elements in the human genome, almost 1,000 of which are conserved across all vertebrates. Roughly a third are found in introns of known genes, a sixth are potential regulatory elements in untranslated regions of protein-coding mRNAs and about half are located far away from any known gene. Only a small fraction of these sequences has been described previously. A comparison with recent tiling array data shows that more than 40% of the predicted structured RNAs overlap with experimentally detected sites of transcription. The widespread conservation of secondary structure points to a large number of functional ncRNAs and cis-acting mRNA structures in the human genome.