Heiss Jonathan A, Breitling Lutz P, Lehne Benjamin, Kooner Jaspal S, Chambers John C, Brenner Hermann
Division of Clinical Epidemiology & Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany.
Pneumology & Respiratory Critical Care Medicine, Thorax Clinic, University of Heidelberg, Heidelberg, Germany.
Epigenomics. 2017 Jan;9(1):13-20. doi: 10.2217/epi-2016-0091. Epub 2016 Nov 25.
AIM: Whole-blood DNA methylation depends on the underlying leukocyte composition, and the resulting confounding is a major concern in epigenome-wide association studies. Measured cell counts are often missing, or obtaining them may not be feasible. Computational approaches estimate leukocyte composition from DNA methylation using reference datasets of purified leukocytes. We explored whether such a model can be trained on whole-blood DNA methylation and cell counts, without the need for purification.
MATERIALS & METHODS: We used whole-blood DNA methylation data and corresponding five-part cell counts from 2445 participants in the London Life Sciences Prospective Population Study. A model was trained on a subset of 175 subjects and evaluated on the remaining 2270.
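The train-and-evaluate design described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are simulated, the sample and CpG counts are toy values (only the training size of 175 is taken from the text), and ridge regression stands in for a model type the abstract does not specify. The helper names `fit_ridge` and `predict` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions); only n_train = 175 comes from the text.
n_subjects, n_cpgs, n_types = 600, 50, 5
n_train = 175

# Simulated cell proportions per subject (each row sums to 1).
props = rng.dirichlet([30, 2, 0.5, 15, 4], size=n_subjects)

# Simulated cell-type-specific methylation level for each CpG.
profiles = rng.uniform(0.05, 0.95, size=(n_types, n_cpgs))

# Whole-blood methylation modelled as a mixture of profiles plus noise.
meth = props @ profiles + rng.normal(0, 0.01, size=(n_subjects, n_cpgs))


def fit_ridge(X, Y, alpha=1e-3):
    """Ridge regression with an unpenalized intercept (illustrative stand-in)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    penalty = alpha * np.eye(Xd.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(Xd.T @ Xd + penalty, Xd.T @ Y)


def predict(X, coef):
    """Apply fitted coefficients to new methylation data."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


# Train on the first 175 simulated subjects, evaluate on the rest.
train = np.arange(n_train)
test = np.arange(n_train, n_subjects)
coef = fit_ridge(meth[train], props[train])
est = predict(meth[test], coef)

# Per-cell-type correlation between reference and estimated proportions.
corr = [np.corrcoef(props[test, k], est[:, k])[0, 1] for k in range(n_types)]
print(np.round(corr, 2))
```

The key point of the design is that the reference values used for training come from cell counts on whole blood, so no purified leukocyte samples enter the fit. The correlations printed here reflect only the simulation settings and say nothing about the study's results.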
RESULTS: Correlations between cell counts and estimated cell proportions were high for neutrophils (0.85), eosinophils (0.88), and lymphocytes (0.84), moderate for monocytes (0.55), and negligible for basophils (0.02). Estimated proportions explained more variance in whole-blood DNA methylation levels than cell counts did.
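One way to compare how much methylation variance two covariate sets explain is to regress each CpG on the covariates and average the resulting R² values. The sketch below assumes this approach; the abstract does not state how the comparison was made. The data are simulated, `mean_r2` is a hypothetical helper, and the two covariate sets are arbitrary stand-ins whose noise levels were chosen for the simulation only and are not findings of the study.

```python
import numpy as np


def mean_r2(meth, covariates):
    """Average in-sample R^2 across CpGs, regressing each CpG on the covariates."""
    X = np.column_stack([np.ones(len(covariates)), covariates])
    coef, *_ = np.linalg.lstsq(X, meth, rcond=None)
    resid = meth - X @ coef
    ss_res = (resid ** 2).sum(axis=0)
    ss_tot = ((meth - meth.mean(axis=0)) ** 2).sum(axis=0)
    return float(np.mean(1.0 - ss_res / ss_tot))


rng = np.random.default_rng(1)
n, n_cpgs = 400, 30

# Simulated proportions, cell-type profiles, and whole-blood methylation.
props = rng.dirichlet([30, 2, 0.5, 15, 4], size=n)
profiles = rng.uniform(0.05, 0.95, size=(5, n_cpgs))
meth = props @ profiles + rng.normal(0, 0.01, size=(n, n_cpgs))

# Two hypothetical covariate sets with different amounts of error (assumed).
covariates_a = props + rng.normal(0, 0.03, size=props.shape)
covariates_b = props + rng.normal(0, 0.01, size=props.shape)

r2_a = mean_r2(meth, covariates_a)
r2_b = mean_r2(meth, covariates_b)
print(f"mean R^2, set A: {r2_a:.3f}; set B: {r2_b:.3f}")
```

In this simulation the less noisy covariate set explains more variance, which shows only how such a comparison behaves, not why the study observed its result.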
CONCLUSION: Our model provided precise estimates for the common cell types.