Division of Pulmonary Medicine, Department of Pediatrics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA.
Clin Epigenetics. 2024 Nov 18;16(1):161. doi: 10.1186/s13148-024-01776-x.
DNA methylation is a critical regulatory mechanism of gene expression, influencing various human diseases and traits. While traditional expression quantitative trait loci (eQTL) studies have helped elucidate the genetic regulation of gene expression, there is a growing need to explore environmental influences on gene expression. Existing methods such as PrediXcan and FUSION focus on genotype-based associations but overlook the impact of environmental factors. To address this gap, we present MOSES (methylation-based gene association), a novel approach that utilizes DNA methylation to identify environmentally regulated genes associated with traits or diseases without relying on measured gene expression.
MOSES involves three steps: training, imputation, and association testing. It employs elastic-net penalized regression models to estimate the influence of CpGs and SNPs (if available) on gene expression. We developed and compared four MOSES versions incorporating different methylation and genetic data: (1) cis-DNA methylation within 1 Mb of promoter regions, (2) both cis-SNPs and cis-CpGs, (3) both cis-CpGs and a subset of trans-CpGs (±5 Mb from promoter regions), and (4) long-range DNA methylation (±10 Mb from promoter regions). Our analysis using nasal epithelium and white blood cell data from the Epigenetic Variation and Childhood Asthma in Puerto Ricans (EVA-PR) study demonstrated that MOSES, particularly the version incorporating long-range CpGs (MOSES-DNAm 10 M), significantly outperformed existing methods such as PrediXcan, MethylXcan, and Biomethyl in predicting gene expression. MOSES-DNAm 10 M also identified more differentially expressed genes (DEGs) associated with atopic asthma, particularly genes involved in immune pathways, highlighting its ability to uncover environmentally regulated genes. Further application of MOSES to lung tissue data from idiopathic pulmonary fibrosis (IPF) patients confirmed its robustness and versatility across different diseases and tissues.
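The three steps described above (training, imputation, association testing) can be illustrated with a minimal, dependency-free Python sketch. This is not the MOSES R package or its API. All function names, the simulated data, the penalty settings (`lam`, `alpha`), the inclusive distance window, and the use of a Pearson correlation as the association test are assumptions made for illustration only.

```python
import math
import random

def cpgs_in_window(cpg_pos, promoter_pos, window_bp):
    """Hypothetical helper: names of CpGs within +/- window_bp of a promoter."""
    return [name for name, pos in cpg_pos.items()
            if abs(pos - promoter_pos) <= window_bp]

def soft_threshold(z, g):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return math.copysign(max(abs(z) - g, 0.0), z)

def fit_elastic_net(X, y, lam=0.05, alpha=0.5, n_iter=200):
    """Training step: coordinate descent on
    (1/2n)*||y - b0 - X b||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2)."""
    n, p = len(X), len(X[0])
    b0 = sum(y) / n
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of CpG j with the residual that excludes CpG j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
        shift = sum(resid) / n  # re-centre the unpenalized intercept
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta

def impute(X, b0, beta):
    """Imputation step: predict expression from methylation alone."""
    return [b0 + sum(b * x for b, x in zip(beta, row)) for row in X]

def association(x, y):
    """Association step (assumed here): Pearson r and its t-statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0, 0.0
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt((n - 2) / max(1.0 - r * r, 1e-12))
    return r, t

def simulate(n, rng):
    """Simulated (not real) data: 5 standardized CpGs, 2 of which drive expression."""
    X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(n)]
    expr = [2.0 * row[0] - 1.5 * row[2] + rng.gauss(0, 0.5) for row in X]
    return X, expr

def demo():
    rng = random.Random(0)
    X_ref, expr_ref = simulate(80, rng)        # reference set with measured expression
    b0, beta = fit_elastic_net(X_ref, expr_ref)
    X_tgt, expr_tgt = simulate(80, rng)        # target set: expression treated as unmeasured
    trait = [e + rng.gauss(0, 1) for e in expr_tgt]
    imputed = impute(X_tgt, b0, beta)
    r, t = association(imputed, trait)
    return beta, r, t

if __name__ == "__main__":
    beta, r, t = demo()
    print("CpG weights:", [round(b, 2) for b in beta])
    print("association r = %.2f, t = %.1f" % (r, t))
```

The point of the sketch is that measured expression is needed only in the reference set used for training. In the target set, the trait is tested against expression imputed from methylation. For a binary trait such as asthma status, a logistic regression would be a more natural association model than the correlation used here.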
MOSES represents an innovative advancement in gene association studies, leveraging DNA methylation to capture the influence of environmental factors on gene expression. By incorporating long-range CpGs, MOSES-DNAm 10 M provides superior predictive accuracy and gene association capabilities compared to traditional genotype-based methods. This novel approach offers valuable insights into the complex interplay between genetics and the environment, enhancing our understanding of disease mechanisms and potentially guiding therapeutic strategies. The user-friendly MOSES R package is publicly available to advance studies in various diseases, including immune-related conditions like asthma.