Smith Martin A, Mattick John S
RNA Biology and Plasticity Laboratory, Garvan Institute of Medical Research, 384 Victoria St, Darlinghurst, NSW, 2010, Australia.
St-Vincent's Clinical School, Faculty of Medicine, UNSW Australia, Sydney, NSW, 2052, Australia.
Methods Mol Biol. 2017;1526:65-85. doi: 10.1007/978-1-4939-6613-4_4.
Protein-coding RNAs represent only a small fraction of the transcriptional output in higher eukaryotes. The remaining RNA species encompass a broad range of molecular functions and regulatory roles, a consequence of the structural polyvalence of RNA polymers. Although several classes of small noncoding RNAs are relatively well characterized, affordable high-throughput sequencing is generating a wealth of novel, unannotated transcripts, especially long noncoding RNAs (lncRNAs) derived from antisense, intronic, and intergenic regions, as well as from regions overlapping protein-coding loci. Parsing and characterizing the functions of noncoding RNAs, lncRNAs in particular, is one of the great challenges of modern genome biology. Here we discuss concepts and computational methods for the identification of structural domains in lncRNAs from genomic and transcriptomic data. In the first part, we briefly review how to identify RNA structural motifs in individual lncRNAs. In the second part, we describe how to leverage the evolutionary dynamics of structured RNAs in a computationally efficient screen to detect putative functional lncRNA motifs using comparative genomics.