Detection and Quantification of 5moU RNA Modification from Direct RNA Sequencing Data.

Author information

Li Jiayi, Sun Feiyang, He Kunyang, Zhang Lin, Meng Jia, Huang Daiyun, Zhang Yuxin

Affiliations

Wisdom Lake Academy of Pharmacy, Xi'an Jiaotong-Liverpool University, Suzhou, 215123, China.

Department of Computer Science, Xi'an Jiaotong-Liverpool University, Suzhou, 215123, China.

Publication information

Curr Genomics. 2024 May 31;25(3):212-225. doi: 10.2174/0113892029288843240402042529. Epub 2024 Apr 16.

Abstract

BACKGROUND

Chemically modified therapeutic mRNAs have gained momentum recently. In addition to commonly used modifications (e.g., pseudouridine), 5-methoxyuridine (5moU) is considered a promising substitute for uridine in therapeutic mRNAs. Accurate identification of 5moU is crucial for the study and quality control of the relevant in vitro-transcribed (IVT) mRNAs. However, current methods do not provide a quantitative means of detecting this modification. In this study, we use Oxford Nanopore direct RNA sequencing and present NanoML-5moU, a machine-learning framework designed specifically for read-level detection and quantification of 5moU modification in IVT data.

MATERIALS AND METHODS

Nanopore direct RNA sequencing data were collected from both 5moU-modified and unmodified control samples. Signal event characteristics (mean and median current intensities, standard deviations, and dwell times) were then analyzed and modeled. Classical machine learning algorithms, notably Support Vector Machine (SVM), Random Forest (RF), and XGBoost, were employed to detect 5moU modifications within NNUNN 5-mers (where N represents A, C, U, or G).
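As a rough illustration of the signal event characteristics described above, the following sketch summarizes the raw current samples aligned to each base of one NNUNN 5-mer into a flat feature vector (four statistics for each of five bases). The function names, the input layout, and the sampling rate in the example are assumptions made for illustration; they are not taken from the NanoML-5moU code base.

```python
from statistics import mean, median, pstdev


def base_event_features(samples, sampling_rate_hz):
    """Summarize the raw current samples aligned to a single base.

    Returns [mean, median, standard deviation, dwell time in seconds].
    """
    if not samples:
        raise ValueError("each base needs at least one current sample")
    return [
        mean(samples),
        median(samples),
        pstdev(samples),                   # population standard deviation
        len(samples) / sampling_rate_hz,   # dwell time = sample count / rate
    ]


def kmer_feature_vector(kmer, per_base_samples, sampling_rate_hz):
    """Build a 20-value feature vector for one NNUNN 5-mer on one read.

    kmer             -- reference 5-mer; the centre base must be U (or T,
                        since reference sequences are often written as DNA)
    per_base_samples -- list of 5 lists of current samples, one per base
                        (hypothetical layout, assumed for this sketch)
    """
    if len(kmer) != 5 or kmer[2] not in "UT":
        raise ValueError("expected a 5-mer with U/T at the centre position")
    if len(per_base_samples) != 5:
        raise ValueError("expected current samples for exactly 5 bases")
    features = []
    for samples in per_base_samples:
        features.extend(base_event_features(samples, sampling_rate_hz))
    return features


# Made-up current values, for illustration only.
example_signal = [
    [80.0, 82.0, 81.0],
    [95.0, 97.0],
    [110.0, 108.0, 109.0, 111.0],
    [90.0, 91.0],
    [70.0, 72.0, 71.0],
]
# The sampling rate is a placeholder; use the value recorded for your run.
vector = kmer_feature_vector("AGTTC", example_signal, sampling_rate_hz=3000.0)
print(len(vector))
```

Vectors of this shape, labeled by whether the read came from the 5moU-modified or the unmodified sample, are the kind of input that a classifier such as SVM, RF, or XGBoost could be trained on.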

RESULTS

The signal event attributes of each constituent base of the NNUNN 5-mers, combined with the XGBoost algorithm, achieved strong performance (a maximum AUROC of 0.9567 on the "AGTTC" reference 5-mer dataset and a minimum AUROC of 0.8113 on the "TGTGC" reference 5-mer dataset). This markedly exceeded the performance of the prevailing background error comparison model (ELIGOS, AUC 0.751 for site-level prediction). The model was further validated on a series of curated datasets with customized modification ratios designed to emulate broader data patterns, demonstrating its general applicability to quality control of IVT mRNA vaccines. The NanoML-5moU framework is publicly available on GitHub (https://github.com/JiayiLi21/NanoML-5moU).
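To make the two quantities in this paragraph concrete, the sketch below computes an AUROC from read-level scores and estimates a modification ratio as the fraction of reads called modified. This is a generic illustration under stated assumptions: the abstract does not specify how NanoML-5moU derives its ratios, so the 0.5 threshold, the function names, and the example scores are all hypothetical.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen modified read (label 1)
    scores higher than a randomly chosen unmodified read (label 0),
    counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one read of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def modification_ratio(probabilities, threshold=0.5):
    """Fraction of reads whose predicted probability meets the threshold.

    The threshold is an assumed cut-off, not a value from the paper.
    """
    if not probabilities:
        raise ValueError("need at least one read")
    called = sum(1 for p in probabilities if p >= threshold)
    return called / len(probabilities)


# Made-up read-level scores at one site: 1 = modified, 0 = unmodified.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

print(round(auroc(labels, scores), 4))   # 8 of 9 pairs ranked correctly
print(modification_ratio(scores))        # 3 of 6 reads at or above 0.5
```

On a curated mixture with a known modification ratio, comparing the estimated ratio with the known one gives a simple check of how well read-level calls translate into quantification.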

CONCLUSION

NanoML-5moU enables accurate read-level profiling of 5moU modification with nanopore direct RNA sequencing and is a powerful tool specialized in unveiling signal patterns in in vitro-transcribed (IVT) mRNAs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96aa/11288159/74690a124213/CG-25-212_F1.jpg
