Center for Statistical Science, Department of Industrial Engineering, Tsinghua University, Beijing, China.
MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China.
Nat Commun. 2022 Jul 22;13(1):4227. doi: 10.1038/s41467-022-31875-3.
RNAs perform their functions by forming specific structures, which can change across cellular conditions. Structure probing experiments combined with next-generation sequencing have enabled transcriptome-wide analysis of RNA secondary structure in various cellular conditions. Differential analysis of structure probing data across conditions can reveal RNA structurally variable regions (SVRs), which are important for understanding RNA function. Here, we propose DiffScan, a computational framework for normalization and differential analysis of structure probing data at high resolution. DiffScan preprocesses structure probing datasets to remove systematic bias, then scans the transcripts to identify SVRs, adaptively determining their lengths and locations. The approach is compatible with most structure probing platforms (e.g., icSHAPE, DMS-seq). When evaluated on simulated and benchmark datasets, DiffScan identifies SVRs at nucleotide resolution, with substantially better accuracy than existing SVR detection methods. Moreover, the improvement is robust across multiple structure probing platforms. Applying DiffScan to a multi-subcellular RNA structurome dataset, followed by a motif enrichment analysis, suggests potential links between RNA structural variation and mRNA abundance, possibly mediated by RNA-binding proteins such as the serine/arginine-rich splicing factors. This work provides an effective tool for differential analysis of RNA secondary structure, reinforcing the power of structure probing experiments in deciphering the dynamic RNA structurome.