Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L7 8TX Liverpool, UK.
Nucleic Acids Res. 2024 May 22;52(9):4830-4842. doi: 10.1093/nar/gkae280.
We present m6ACali, a machine-learning framework that improves the accuracy of N6-methyladenosine (m6A) epitranscriptome profiling by reducing the impact of non-specific antibody enrichment in MeRIP-Seq. The calibration model is a genomic feature-based classifier that refines m6A site identification by distinguishing genuine sites from those that can also be detected in in vitro transcribed (IVT) control experiments. We find that m6ACali effectively identifies non-specific binding peaks reported by exomePeak2 and MACS2 in new MeRIP-Seq datasets without requiring paired IVT controls. Model interpretation revealed that off-target antibody binding sites commonly occur in short exons and short mRNAs, and originate from high read coverage regions that share the sequence motif of true m6A sites. We also show that the same machine-learning strategy can be used to calibrate differentially methylated peaks and the results of other antibody-dependent, base-resolution m6A detection techniques. m6ACali therefore offers a promising approach for broadly improving m6A profiles generated by MeRIP-Seq experiments, raising the standard for omics-level m6A data integration.
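The abstract describes the calibration model only at a high level, as a classifier over genomic features that flags peaks likely to come from non-specific antibody binding. The following is a minimal sketch of that idea, not the m6ACali implementation. It assumes three hypothetical features (exon length, mRNA length and read coverage), uses invented toy values for training, and substitutes a plain logistic regression for whatever model the authors actually used. All function and variable names are illustrative.

```python
import math

# Hypothetical toy peaks: (exon_length_nt, mrna_length_nt, read_coverage).
# All values are invented for illustration; they are not from the paper.
OFF_TARGET = [(120, 900, 800), (150, 1100, 650), (90, 700, 900), (200, 1300, 500)]
GENUINE = [(1500, 4500, 120), (2200, 6000, 80), (900, 3500, 200), (3000, 8000, 60)]


def log_features(peak):
    """Log-transform the raw feature values so they share a similar scale."""
    return [math.log10(v) for v in peak]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit(peaks, labels, lr=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient descent (label 1 = off-target)."""
    raw = [log_features(p) for p in peaks]
    n, d = len(raw), len(raw[0])
    # Center each feature on its training mean to keep the optimisation well behaved.
    means = [sum(r[j] for r in raw) / n for j in range(d)]
    X = [[r[j] - means[j] for j in range(d)] for r in raw]
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(X, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return {"w": w, "b": b, "means": means}


def off_target_probability(model, peak):
    """Predicted probability that a peak reflects non-specific antibody binding."""
    x = [v - m for v, m in zip(log_features(peak), model["means"])]
    return sigmoid(sum(wj * xj for wj, xj in zip(model["w"], x)) + model["b"])


def calibrate(model, peaks, threshold=0.5):
    """Keep only peaks whose off-target probability is below the threshold."""
    return [p for p in peaks if off_target_probability(model, p) < threshold]


MODEL = fit(OFF_TARGET + GENUINE, [1] * len(OFF_TARGET) + [0] * len(GENUINE))
```

In this sketch, calling `calibrate(MODEL, peaks)` on a list of called peaks returns the subset predicted to be genuine. The 0.5 threshold is an arbitrary choice for illustration; a real calibration would be trained on peaks labelled with paired IVT controls and would use a threshold chosen on held-out data.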