Department of Bioengineering, Northeastern University, Boston, MA, USA.
Department of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA, USA.
Sci Rep. 2024 Sep 28;14(1):22457. doi: 10.1038/s41598-024-72994-9.
Chemical modifications to mRNA respond dynamically to environmental cues and are important modulators of gene expression. Nanopore direct RNA sequencing has been used to detect pseudouridine (ψ) modifications through basecalling errors and signal analysis. These approaches depend strongly on the sequence context around the modification, and the occupancies derived from them are not quantitative. In this work, we combine direct RNA sequencing of synthetic RNAs bearing site-specific modifications with supervised machine learning models (ModQuant) to achieve near-analytical, site-specific ψ quantification. Our models demonstrate that the ionic current signal features important for accurate ψ classification are sequence dependent and carry information from beyond the n - 2 to n + 2 nucleotides surrounding the ψ site. This contradicts current models, which assume that accurate ψ classification can be achieved with signal information confined to the 5-nucleotide k-mer window (n - 2 to n + 2 around the ψ site). We applied our models to quantitatively profile ψ occupancy at five mRNA sites in datasets from seven human cell lines, revealing both conserved and variable sites. Our study motivates a broader pipeline that uses ground-truth RNA control sets with site-specific modifications for quantitative profiling of RNA modifications. The ModQuant pipeline and guide are freely available at https://github.com/wanunulab/ModQuant.
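As an illustration of the two ideas in the abstract, the following minimal Python sketch shows how per-read classifier outputs could be turned into a site occupancy, and how a feature window wider than the 5-nucleotide k-mer would be indexed. It is not the ModQuant code or API: the function names, the 0.5 decision threshold, and the example probabilities are hypothetical, chosen only to make the arithmetic concrete.

```python
# Hypothetical sketch; names and threshold are assumptions, not ModQuant's API.

def window_positions(site, flank):
    """Return the positions n - flank ... n + flank around a ψ site at index `site`."""
    if flank < 0:
        raise ValueError("flank must be non-negative")
    return list(range(site - flank, site + flank + 1))


def site_occupancy(read_probs, threshold=0.5):
    """Estimate occupancy as the fraction of reads called modified.

    `read_probs` holds one predicted ψ probability per read covering the site;
    a read is called modified when its probability is >= `threshold`.
    """
    probs = list(read_probs)
    if not probs:
        raise ValueError("no reads cover this site")
    modified = sum(1 for p in probs if p >= threshold)
    return modified / len(probs)


# The conventional 5-nucleotide window (n - 2 to n + 2) versus a wider one.
kmer_window = window_positions(site=10, flank=2)   # 5 positions
wide_window = window_positions(site=10, flank=4)   # 9 positions

# Made-up per-read probabilities for one site: 3 of 5 reads are called modified.
example_probs = [0.9, 0.8, 0.2, 0.6, 0.1]
occupancy = site_occupancy(example_probs)
```

In a real pipeline the per-read probabilities would come from a classifier trained on the synthetic, site-specifically modified control RNAs, with signal features drawn from whichever window is found to be informative for that sequence context.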