MST-m6A: A Novel Multi-Scale Transformer-based Framework for Accurate Prediction of m6A Modification Sites Across Diverse Cellular Contexts.

Authors

Su Qiaosen, Phan Le Thi, Pham Nhat Truong, Wei Leyi, Manavalan Balachandran

Affiliations

Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea.

Faculty of Applied Sciences, Macao Polytechnic University, Macau.

Publication information

J Mol Biol. 2025 Mar 15;437(6):168856. doi: 10.1016/j.jmb.2024.168856. Epub 2024 Nov 6.

Abstract

N6-methyladenosine (m6A) modification, a prevalent epigenetic mark in eukaryotic cells, is crucial in regulating gene expression and RNA metabolism. Accurately identifying m6A modification sites is essential for understanding their functions within biological processes and the intricate mechanisms that regulate them. Recent advances in high-throughput sequencing technologies have enabled the generation of extensive datasets characterizing m6A modification sites at single-nucleotide resolution, leading to the development of computational methods for identifying m6A RNA modification sites. However, most current methods focus on specific cell lines, limiting their generalizability and practical application across diverse biological contexts. To address this limitation, we propose MST-m6A, a novel approach for identifying m6A modification sites with higher accuracy across various cell lines and tissues. MST-m6A utilizes a multi-scale transformer-based architecture, employing dual k-mer tokenization to capture rich feature representations and global contextual information from RNA sequences at multiple levels of granularity. These representations are then combined using a channel fusion mechanism and further processed by a convolutional neural network to enhance prediction accuracy. Rigorous validation demonstrates that MST-m6A significantly outperforms conventional machine learning models, deep learning models, and state-of-the-art predictors. We anticipate that the high precision and cross-cell-type adaptability of MST-m6A will provide valuable insights into m6A biology and facilitate advancements in related fields. The proposed approach is available at https://github.com/cbbl-skku-org/MST-m6A/ for prediction and reproducibility purposes.
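
As a rough illustration of what "dual k-mer tokenization" could look like, the sketch below splits one RNA sequence into overlapping k-mers at two different values of k, giving two token streams at different granularities. It is not taken from the MST-m6A code base: the function names, the choice of k = 3 and k = 5, the stride of 1, and the example sequence are all assumptions made for illustration, since the abstract does not state these details.

```python
from typing import List, Tuple


def kmer_tokens(seq: str, k: int, stride: int = 1) -> List[str]:
    """Split a sequence into overlapping k-mers (hypothetical helper)."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    # Normalise to upper-case RNA letters (DNA-style T becomes U).
    seq = seq.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def dual_kmer_tokens(
    seq: str, k_small: int = 3, k_large: int = 5
) -> Tuple[List[str], List[str]]:
    """Tokenize the same sequence at two scales.

    The values k_small=3 and k_large=5 are assumed for illustration only.
    """
    return kmer_tokens(seq, k_small), kmer_tokens(seq, k_large)


# Made-up example sequence, not a real m6A site.
small, large = dual_kmer_tokens("GGACUAGGACU")
print(small)
print(large)
```

In a multi-scale setup of the kind the abstract describes, each token stream would presumably be embedded and encoded separately before the two representations are fused; that part is not shown here.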
