Center for non-coding RNA in Technology and Health (RTH), University of Copenhagen, DK-1870 Frederiksberg, Denmark.
Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, DK-1870 Frederiksberg, Denmark.
Genome Res. 2017 Aug;27(8):1371-1383. doi: 10.1101/gr.208652.116. Epub 2017 May 9.
Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human-mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3' ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality.