Allan Matthew F, Aruda Justin, Plung Jesse S, Grote Scott L, des Taillades Yves J Martin, de Lajarte Albéric A, Bathe Mark, Rouskin Silvi
Department of Microbiology, Harvard Medical School, Boston, Massachusetts, USA 02115.
Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA 02139.
Res Sq. 2024 Aug 7:rs.3.rs-4814547. doi: 10.21203/rs.3.rs-4814547/v1.
RNA molecules perform a diversity of essential functions for which their linear sequences must fold into higher-order structures. Techniques including crystallography and cryogenic electron microscopy have revealed 3D structures of ribosomal, transfer, and other well-structured RNAs, while chemical probing with sequencing facilitates secondary structure modeling of any RNA of interest, even within cells. Ongoing efforts continue to increase the accuracy and resolution of these methods and their ability to distinguish coexisting alternative structures. However, no existing method can discover and quantify alternative structures with base pairs spanning arbitrarily long distances, an obstacle for studying viral, messenger, and long noncoding RNAs, which may form long-range base pairs. Here, we introduce the method of Structure Ensemble Ablation by Reverse Complement Hybridization with Mutational Profiling (SEARCH-MaP) and software for Structure Ensemble Inference by Sequencing, Mutation Identification, and Clustering of RNA (SEISMIC-RNA). We use SEARCH-MaP and SEISMIC-RNA to discover that the frameshift stimulating element of SARS coronavirus 2 base-pairs with another element 1 kilobase downstream in nearly half of RNA molecules, and that this structure competes with a pseudoknot that stimulates ribosomal frameshifting. Moreover, we identify long-range base pairs involving the frameshift stimulating element in other coronaviruses, including SARS coronavirus 1 and transmissible gastroenteritis virus, and model the full genomic secondary structure of the latter. These findings suggest that long-range base pairs are common in coronaviruses and may regulate ribosomal frameshifting, which is essential for viral RNA synthesis. We anticipate that SEARCH-MaP will enable solving many RNA structure ensembles that have eluded characterization, thereby enhancing our general understanding of RNA structures and their functions. SEISMIC-RNA, software for analyzing mutational profiling data at any scale, could power future studies on RNA structure and is available on GitHub and the Python Package Index.