Institute of Plant and Microbial Biology, Academia Sinica, Taipei, 115, Taiwan.
Bioinformatics Program, Taiwan International Graduate Program, National Taiwan University, Taipei, 115, Taiwan.
Epigenetics Chromatin. 2023 Nov 9;16(1):44. doi: 10.1186/s13072-023-00521-7.
In a heterogeneous population of cells, individual cells can behave differently and respond variably to the environment. This cellular diversity can be assessed by measuring DNA methylation patterns. The loci with variable methylation patterns are informative of cellular heterogeneity and may serve as biomarkers of diseases and developmental progression. Cell-to-cell methylation heterogeneity can be evaluated through single-cell methylomes or computational techniques for pooled cells. However, the feasibility and performance of these approaches to precisely estimate methylation heterogeneity require further assessment.
Here, we propose model-based methods, adapted from a mathematical framework originally developed for biodiversity, to estimate genome-wide DNA methylation heterogeneity. We compared the features of our models with those of existing methods and tested them on both synthetic and real datasets. Overall, our methods showed an advantage over the others because their scores correlate better with the actual heterogeneity. We also demonstrated that methylation heterogeneity offers an additional layer of biological information distinct from the conventional methylation level. In the case studies, we showed that distinct profiles of methylation heterogeneity in CG and non-CG methylation can predict the regulatory roles between genomic elements in Arabidopsis. This opens up a new direction for plant epigenomics. Finally, we demonstrated that our score may identify loci in human cancer samples as putative biomarkers for early cancer detection.
We adapted a mathematical framework from biodiversity into three model-based methods for analyzing genome-wide DNA methylation heterogeneity as a means of monitoring cellular heterogeneity. Our methods, collectively named MeH, have been implemented, evaluated against existing methods, and made openly available to the research community.
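To illustrate how a biodiversity-style index can capture information that the methylation level alone misses, the following is a minimal sketch and not the MeH implementation. It assumes that each sequencing read covering a fixed window of CpG sites is written as a string of `0` (unmethylated) and `1` (methylated), and it uses Shannon entropy as a stand-in diversity index. The function names `methylation_level` and `shannon_heterogeneity` are hypothetical and chosen only for this example.

```python
from collections import Counter
from math import log2


def methylation_level(patterns):
    """Fraction of methylated calls ('1') over all reads in the window."""
    calls = "".join(patterns)
    return calls.count("1") / len(calls)


def shannon_heterogeneity(patterns):
    """Shannon entropy of read-level patterns, scaled to [0, 1].

    Assumption for this sketch: every read spans the same number of CpG
    sites, so dividing by the window width (the maximum possible entropy
    in bits) normalises the score.
    """
    counts = Counter(patterns)
    total = len(patterns)
    entropy_bits = sum((n / total) * log2(total / n) for n in counts.values())
    return entropy_bits / len(patterns[0])


# Two toy windows with the same methylation level but different diversity.
window_a = ["1100", "1100", "1100", "1100"]  # every read shares one pattern
window_b = ["1111", "0000", "1100", "0011"]  # four distinct patterns

for name, window in (("A", window_a), ("B", window_b)):
    print(name, methylation_level(window), shannon_heterogeneity(window))
```

Both toy windows have a methylation level of 0.5, yet only the second contains more than one read pattern, so only it receives a non-zero heterogeneity score. This is the sense in which heterogeneity can be read as a layer of information separate from the methylation level.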