Department of Biomedical Physics, Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Russia.
Federal State Institution «Federal Research Centre «Fundamentals of Biotechnology» of the Russian Academy of Sciences», 119071 Moscow, Russia.
Int J Mol Sci. 2023 May 11;24(10):8591. doi: 10.3390/ijms24108591.
Differential methylation (DM) analysis is widely used in both fundamental and translational studies. Microarray- and NGS-based approaches are currently the most common platforms for methylation analysis, and multiple statistical models have been designed to extract differential methylation signatures from them. Benchmarking DM models is challenging because gold standard data are absent. In this study, we analyze a large number of publicly available NGS and microarray datasets with diverse, widely used statistical models and apply the recently proposed and validated rank-statistic-based approach Hobotnica to evaluate the quality of their results. Overall, microarray-based methods produce more robust and convergent results, whereas NGS-based models are highly dissimilar. Tests on simulated NGS data tend to overestimate the quality of DM methods and should therefore be used with caution. Evaluating the top 10 and top 100 DMCs alongside the full, non-subsetted signature also shows more stable results for microarray data. In summary, given the observed heterogeneity in NGS methylation data, evaluating newly generated methylation signatures is a crucial step in DM analysis. The Hobotnica metric is consistent with previously developed quality metrics and provides a robust, sensitive, and informative estimate of method performance and DM signature quality in the absence of gold standard data, addressing a long-standing problem in DM analysis.
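To make the idea of a rank-statistic-based quality score concrete, the following is a minimal sketch and not the published Hobotnica implementation. It assumes that a signature is judged by how well the sample-to-sample distances computed on the signature sites separate the known groups: all pairwise distances are ranked, and the score is the normalized Mann-Whitney U statistic of the between-group distances (1.0 when every between-group distance exceeds every within-group distance, about 0.5 for no separation). The function names, the Euclidean distance, and the toy beta values are illustrative assumptions.

```python
from itertools import combinations


def euclidean_matrix(profiles):
    """Pairwise Euclidean distances between sample profiles (lists of beta values)."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5 for q in profiles]
            for p in profiles]


def average_ranks(values):
    """Return 1-based ascending ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_separation_score(dist, labels):
    """Hypothetical rank-based separation score in [0, 1] (normalized Mann-Whitney U)."""
    pairs = list(combinations(range(len(labels)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    between = [labels[i] != labels[j] for i, j in pairs]
    m = sum(between)          # number of between-group pairs
    n_pairs = len(pairs)      # number of all sample pairs
    if m == 0 or m == n_pairs:
        raise ValueError("need both within-group and between-group sample pairs")
    rank_sum = sum(r for r, b in zip(ranks, between) if b)
    u = rank_sum - m * (m + 1) / 2
    return u / (m * (n_pairs - m))


# Toy data (made up): 4 samples x 2 signature sites, methylation beta values.
profiles = [[0.10, 0.20], [0.15, 0.25], [0.80, 0.90], [0.85, 0.95]]
dist = euclidean_matrix(profiles)
good = rank_separation_score(dist, ["ctrl", "ctrl", "case", "case"])
bad = rank_separation_score(dist, ["ctrl", "case", "ctrl", "case"])
print(good, bad)
```

With the true group labels the toy signature separates the samples perfectly, while the mismatched labels give a much lower score. This kind of comparison is what allows signatures from different DM models to be ranked without gold standard data.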