Fu Haitao, Ding Zewen, Wang Wen
School of Artificial Intelligence, Hubei University, Wuhan, 430062, China.
University of Edinburgh, Centre for Discovery Brain Sciences, Edinburgh, EH8 9XD, United Kingdom.
Methods. 2025 Feb;234:178-186. doi: 10.1016/j.ymeth.2024.12.010. Epub 2024 Dec 30.
5-Methylcytosine (m5C) plays a pivotal role in various RNA metabolic processes, including RNA localization, stability, and translation. Current high-throughput sequencing technologies for m5C site identification are resource-intensive in terms of cost, labor, and time, so there is a pressing need for efficient computational approaches. Many existing computational methods rely on intricate hand-crafted features, some of which may be unavailable, and this often leads to suboptimal prediction accuracy. To address these challenges, we introduce a novel deep-learning method, Trans-m5C. We first categorize m5C sites into NSUN2-dependent and NSUN6-dependent types so that features can be extracted independently for each type. Carefully designed transformer neural networks are then employed to distill global features, and m5C sites are predicted by a discriminator built from a multi-layer perceptron. A rigorous evaluation of Trans-m5C on experimentally validated m5C data from human and mouse shows that our method offers a competitive edge over both baseline and existing methods.
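The pipeline described in the abstract (sequence features, a transformer stage for global context, then a multi-layer perceptron that scores the site) can be illustrated with a minimal sketch. This is not the Trans-m5C implementation: the one-hot encoding, the single attention head, the mean pooling, the layer sizes, the 21-nt window, and the names `one_hot`, `self_attention`, `mlp_score`, and `predict` are all assumptions made for illustration. The weights are random and untrained, and positional encoding, multi-head attention, and layer normalization are omitted, so the score carries no biological meaning. Under the abstract's NSUN2/NSUN6 split, one such model would presumably be trained per site type.

```python
import math
import random

NUCLEOTIDES = "ACGU"

def one_hot(seq):
    """Encode an RNA string as a list of 4-dimensional one-hot vectors."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rng, rows, cols):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over sequence positions."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d_k = len(k[0])
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

def mlp_score(features, w1, w2):
    """Two-layer perceptron: ReLU hidden layer, sigmoid output in (0, 1)."""
    hidden = [max(0.0, h) for h in matmul([features], w1)[0]]
    logit = matmul([hidden], w2)[0][0]
    return 1.0 / (1.0 + math.exp(-logit))

def predict(seq, seed=0, d_model=8, d_hidden=16):
    """Score one sequence window; every position attends to every other."""
    rng = random.Random(seed)
    x = one_hot(seq)
    wq, wk, wv = (random_matrix(rng, 4, d_model) for _ in range(3))
    w1 = random_matrix(rng, d_model, d_hidden)
    w2 = random_matrix(rng, d_hidden, 1)
    attended, weights = self_attention(x, wq, wk, wv)
    # Mean-pool over positions to get one global feature vector per window.
    pooled = [sum(col) / len(attended) for col in zip(*attended)]
    return mlp_score(pooled, w1, w2), weights

# Hypothetical 21-nt window with the candidate cytosine at the center.
window = "AGGCUAUGGACCUUGAGCUAA"
score, weights = predict(window)
print(f"untrained score: {score:.3f}")
```

Each row of `weights` sums to 1 and spans the whole window, which is what lets the attention stage mix information from distant positions rather than only local neighbors.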