Imputation of missing covariate values in epigenome-wide analysis of DNA methylation data.

Author information

Wu Chong, Demerath Ellen W, Pankow James S, Bressler Jan, Fornage Myriam, Grove Megan L, Chen Wei, Guan Weihua

Affiliations

a Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA.

b Division of Epidemiology & Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA.

Publication information

Epigenetics. 2016;11(2):132-9. doi: 10.1080/15592294.2016.1145328. Epub 2016 Feb 18.

Abstract

DNA methylation is a widely studied epigenetic mechanism, and alterations in methylation patterns may be involved in the development of common diseases. Unlike inherited changes in genetic sequence, site-specific methylation varies by tissue, developmental stage, and disease status, and may be affected by aging and by exposure to environmental factors such as diet or smoking. These non-genetic factors are typically included as covariates in epigenome-wide association studies (EWAS) because they may confound the association between methylation and disease. However, missing values in these variables can reduce the usable sample size and therefore the statistical power of EWAS. We propose a site selection and multiple imputation (MI) method to impute missing covariate values and to perform association tests in EWAS. We then compare this method to an alternative projection-based method. Through simulations, we show that the MI-based method is slightly conservative but provides consistent estimates of effect size. We also illustrate these methods with data from the Atherosclerosis Risk in Communities (ARIC) study, carrying out an EWAS of methylation levels and smoking status in which missing cell type compositions and white blood cell counts are imputed.
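The abstract does not state how results from the individual imputed data sets are combined. A common choice in multiple imputation is Rubin's rules, and the sketch below assumes that approach; it is an illustration of the general technique, not the authors' implementation. The function name `pool_rubin` and the input numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation effect estimates with Rubin's rules.

    estimates: effect estimate from each imputed data set
    variances: squared standard error of each of those estimates
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)        # pooled effect estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical results for one CpG site from three imputed data sets
# (illustrative numbers only, not taken from the paper).
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term inflates the variance to reflect uncertainty about the imputed covariate values, which is one reason an MI-based test can be somewhat conservative.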

