Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA.
Wiley Interdiscip Rev RNA. 2021 Sep;12(5):e1649. doi: 10.1002/wrna.1649. Epub 2021 Mar 22.
A structure predicted by a single-sequence RNA folding program is not evidence that an RNA has a structure important for its function. Random sequences have plausible and complex predicted structures that are not easily distinguishable from those of structural RNAs. Telling whether an RNA has a conserved structure requires looking at the evolutionary signature left by that conservation. This question matters not just for long noncoding RNAs, which usually lack an identified function, but also for RNA binding protein motifs, which can be single-stranded RNAs or structures. Here we review recent advances in using sequence and structural analysis to determine whether an RNA structure is conserved. Although covariation measures assess structural RNA conservation, one must distinguish covariation due to RNA structure from covariation due to independent phylogenetic substitutions. We review a statistical test that measures the false positives expected under the null hypothesis of phylogenetic covariation alone (specificity). We also review a complementary test that measures power, that is, the expected covariation derived from sequence variation alone (sensitivity). Power in the absence of covariation signals the absence of a conserved RNA structure. We analyze artifacts that falsely identify conserved RNA structure, such as the misuse of programs that do not assess significance, the use of inappropriate statistics confounded by signals other than covariation, and misalignments that induce spurious covariation. Among artifacts that obscure the signal of a conserved RNA structure, we discuss the inclusion of pseudogenes in alignments, which increases power but destroys covariation. This article is categorized under: RNA Structure and Dynamics > RNA Structure, Dynamics and Chemistry; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution.
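As a rough illustration of what a covariation measure between two alignment columns looks like, the following Python sketch computes mutual information for a pair of columns and estimates a p-value by shuffling one column. This is a toy under stated assumptions, not the test reviewed in the article: mutual information is used here only as one common covariation measure, the function names and the toy columns are hypothetical, and the shuffle-based null ignores phylogeny, whereas the null hypothesis described above explicitly accounts for covariation arising from phylogenetic substitutions.

```python
import math
import random
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    pair_counts = Counter(zip(col_i, col_j))
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    mi = 0.0
    for (a, b), n_ab in pair_counts.items():
        # p_ab * log2(p_ab / (p_a * p_b)), written in terms of raw counts
        mi += (n_ab / n) * math.log2(n_ab * n / (count_i[a] * count_j[b]))
    return mi


def permutation_p_value(col_i, col_j, n_shuffles=1000, seed=0):
    """Fraction of shuffles whose MI reaches the observed MI.

    ASSUMPTION: a naive shuffle null that ignores the phylogeny
    relating the sequences; it is NOT the phylogeny-aware null
    discussed in the text.
    """
    rng = random.Random(seed)
    observed = mutual_information(col_i, col_j)
    shuffled = list(col_j)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if mutual_information(col_i, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical toy columns from an 8-sequence alignment
paired_5p = list("GGCCAAUU")   # varies, with compensatory changes...
paired_3p = list("CCGGUUAA")   # ...in its partner column
invariant = list("AAAAAAAA")   # fully conserved column

mi_paired = mutual_information(paired_5p, paired_3p)
mi_invariant = mutual_information(paired_5p, invariant)
p_paired = permutation_p_value(paired_5p, paired_3p)
```

The invariant column yields zero mutual information regardless of whether it is base paired, which is the intuition behind measuring power separately: without sequence variation, an alignment cannot show covariation even if a structure is conserved.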