Department of Chemical & Biological Engineering, Northwestern University, Evanston, IL, USA.
Center for Synthetic Biology, Northwestern University, Evanston, IL, USA.
Bioinformatics. 2019 Dec 15;35(24):5103-5112. doi: 10.1093/bioinformatics/btz449.
RNA molecules can undergo complex structural dynamics, especially during transcription, which influence their biological functions. Recently developed high-throughput chemical probing experiments that study RNA cotranscriptional folding generate nucleotide-resolution 'reactivities' for each length of a growing nascent RNA that reflect structural dynamics. However, the manual annotation and qualitative interpretation of reactivity across these large datasets can be nuanced, laborious, and difficult for new practitioners. We developed a quantitative and systematic approach to automatically detect RNA folding events from these datasets to reduce human bias/error, standardize event discovery and generate hypotheses about RNA folding trajectories for further analysis and experimental validation.
Detection of Unknown Events with Tunable Thresholds (DUETT) identifies RNA structural transitions in cotranscriptional RNA chemical probing datasets. DUETT employs a feedback control-inspired method and a linear regression approach and relies on interpretable and independently tunable parameter thresholds to match qualitative user expectations with quantitatively identified folding events. We validate the approach by identifying known RNA structural transitions within the cotranscriptional folding pathways of the Escherichia coli signal recognition particle RNA and the Bacillus cereus crcB fluoride riboswitch. We identify previously overlooked features of these datasets such as heightened reactivity patterns in the signal recognition particle RNA about 12 nt lengths before base-pair rearrangement. We then apply a sensitivity analysis to identify tradeoffs when choosing parameter thresholds. Finally, we show that DUETT is tunable across a wide range of contexts, enabling flexible application to study broad classes of RNA folding mechanisms.
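To make the idea of combining a feedback control-inspired signal with a linear regression check more concrete, the following is a minimal illustrative sketch. It is not the DUETT implementation (see the repository for that). It assumes a single nucleotide's reactivity trace ordered by transcript length, and every name, window size, and threshold (`detect_events`, `window_slope`, `p_thresh`, `i_thresh`, `slope_thresh`) is hypothetical. An index is flagged only when an immediate deviation from the recent baseline (a proportional-like term), the accumulated deviation over the next few lengths (an integral-like term), and a fitted slope all pass their independently tunable thresholds in the same direction.

```python
from statistics import mean


def window_slope(values):
    """Least-squares slope of values against their indices 0..n-1."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def detect_events(trace, window=5, p_thresh=0.5, i_thresh=1.0,
                  slope_thresh=0.05):
    """Return (index, direction) pairs where a sustained change is flagged.

    trace: reactivities of one nucleotide, ordered by transcript length.
    The window size and all thresholds are hypothetical defaults chosen
    for illustration, not values taken from the paper.
    """
    events = []
    for t in range(window, len(trace) - window + 1):
        baseline = mean(trace[t - window:t])   # recent history
        ahead = trace[t:t + window]            # upcoming transcript lengths
        errors = [v - baseline for v in ahead]
        p_term = errors[0]                     # immediate deviation
        i_term = sum(errors)                   # accumulated deviation
        slope = window_slope(trace[t - window:t + window])
        for sign in (1, -1):                   # +1 = increase, -1 = decrease
            if (sign * p_term > p_thresh
                    and sign * i_term > i_thresh
                    and sign * slope > slope_thresh):
                events.append((t, sign))
    return events


if __name__ == "__main__":
    # Synthetic trace: low reactivity for 10 lengths, then a step up.
    step_up = [0.1] * 10 + [1.0] * 10
    print(detect_events(step_up))
```

Because each threshold gates a separate criterion, raising `p_thresh` suppresses small jumps, raising `i_thresh` suppresses short-lived spikes, and raising `slope_thresh` suppresses changes without a consistent trend. This mirrors, in a simplified way, the tradeoffs between thresholds that the sensitivity analysis examines.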
https://github.com/BagheriLab/DUETT.
Supplementary data are available at Bioinformatics online.