Ma Baoshan, Wilker Elissa H, Willis-Owen Saffron A G, Byun Hyang-Min, Wong Kenny C C, Motta Valeria, Baccarelli Andrea A, Schwartz Joel, Cookson William O C M, Khabbaz Kamal, Mittleman Murray A, Moffatt Miriam F, Liang Liming
Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA; College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning Province 116026, China; Cardiovascular Epidemiology Research Unit, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA; Department of Environmental Health, Harvard School of Public Health, Boston, MA 02115, USA; National Heart and Lung Institute, Imperial College, London SW3 6LY, UK; Department of Clinical Sciences and Community, University of Milan, Milan 20122, Italy; Division of Cardiac Surgery, Department of Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA; and Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA.
Nucleic Acids Res. 2014 Apr;42(6):3515-28. doi: 10.1093/nar/gkt1380. Epub 2014 Jan 20.
Differences in methylation across tissues are critical to cell differentiation and are key to understanding the role of epigenetics in complex diseases. In this investigation, we found that locus-specific methylation differences between tissues are highly consistent across individuals. We developed a novel statistical model to predict locus-specific methylation in target tissue based on methylation in surrogate tissue. The method was evaluated in publicly available data and in two studies using the latest Illumina BeadChips: a childhood asthma study with methylation measured in both peripheral blood leukocytes (PBL) and lymphoblastoid cell lines; and a study of postoperative atrial fibrillation with methylation in PBL, atrium and artery. We found that our method can greatly improve accuracy of cross-tissue prediction at CpG sites that are variable in the target tissue [R² increases from 0.38 (original R² between tissues) to 0.89 for PBL-to-artery prediction; from 0.39 to 0.95 for PBL-to-atrium; and from 0.81 to 0.98 for lymphoblastoid cell line-to-PBL based on cross-validation, and confirmed using cross-study prediction]. An extended model with multiple CpGs further improved performance. Our results suggest that large-scale epidemiology studies using easy-to-access surrogate tissues (e.g. blood) could be recalibrated to improve understanding of epigenetics in hard-to-access tissues (e.g. atrium) and might enable non-invasive disease screening using epigenetic profiles.
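The abstract does not spell out the form of the statistical model, so the following is only a minimal sketch of the general idea of surrogate-to-target prediction with cross-validation. It assumes a simple per-CpG linear regression, which is a stand-in and not the authors' model, and it runs on simulated data rather than any of the study datasets. All function and variable names (`fit_per_cpg_linear`, `cross_validated_prediction`, `tissue_shift`, and so on) are hypothetical, as are the simulation parameters.

```python
import numpy as np


def fit_per_cpg_linear(surrogate, target):
    """Fit target ~ intercept_j + slope_j * surrogate separately for each CpG column j."""
    x_mean = surrogate.mean(axis=0)
    y_mean = target.mean(axis=0)
    xc = surrogate - x_mean
    yc = target - y_mean
    var_x = (xc ** 2).sum(axis=0)
    # Guard against CpGs with no variation in the surrogate tissue.
    safe_var = np.where(var_x > 0, var_x, 1.0)
    slope = np.where(var_x > 0, (xc * yc).sum(axis=0) / safe_var, 0.0)
    intercept = y_mean - slope * x_mean
    return intercept, slope


def cross_validated_prediction(surrogate, target, n_folds=5, seed=0):
    """Predict each individual's target-tissue values from models fit on the other folds."""
    rng = np.random.default_rng(seed)
    n = surrogate.shape[0]
    folds = np.array_split(rng.permutation(n), n_folds)
    predicted = np.empty_like(target, dtype=float)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        intercept, slope = fit_per_cpg_linear(surrogate[train_mask], target[train_mask])
        predicted[test_idx] = intercept + slope * surrogate[test_idx]
    return predicted


def pooled_r2(a, b):
    """Squared Pearson correlation over all individuals and CpGs pooled together."""
    r = np.corrcoef(a.ravel(), b.ravel())[0, 1]
    return r ** 2


# Simulated data (assumption): each CpG has a baseline level, a tissue-specific
# shift shared by all individuals, and a small individual-level deviation.
rng = np.random.default_rng(42)
n_people, n_cpgs = 60, 200
baseline = rng.uniform(0.2, 0.8, size=n_cpgs)
tissue_shift = rng.normal(0.0, 0.15, size=n_cpgs)
person_effect = rng.normal(0.0, 0.03, size=(n_people, n_cpgs))

surrogate = np.clip(
    baseline + person_effect + rng.normal(0.0, 0.01, size=(n_people, n_cpgs)), 0.0, 1.0
)
target = np.clip(
    baseline + tissue_shift + 0.8 * person_effect
    + rng.normal(0.0, 0.01, size=(n_people, n_cpgs)),
    0.0,
    1.0,
)

predicted = cross_validated_prediction(surrogate, target)
r2_raw = pooled_r2(surrogate, target)    # surrogate used directly as the prediction
r2_pred = pooled_r2(predicted, target)   # after per-CpG recalibration

print(f"raw R^2 between tissues: {r2_raw:.2f}")
print(f"cross-validated R^2 after recalibration: {r2_pred:.2f}")
```

In this toy setup the improvement comes almost entirely from the locus-specific shift being shared across individuals, which mirrors the consistency the abstract reports. The numbers it prints describe the simulation only and should not be compared with the published R² values.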