Liu Chunyu, Joehanes Roby, Ma Jiantao, Wang Yuxuan, Sun Xianbang, Keshawarz Amena, Sooda Meera, Huan Tianxiao, Hwang Shih-Jen, Bui Helena, Tejada Brandon, Munson Peter J, Demirkale Cumhur, Heard-Costa Nancy L, Pitsillides Achilleas N, Peloso Gina M, Feolo Michael, Sharopova Nataliya, Vasan Ramachandran S, Levy Daniel
Department of Biostatistics, School of Public Health, Boston University, Boston, MA, USA.
Framingham Heart Study, Framingham, MA, USA.
medRxiv. 2022 May 3:2022.04.13.22273841. doi: 10.1101/2022.04.13.22273841.
To create a scientific resource of expression quantitative trait loci (eQTL), we conducted a genome-wide association study (GWAS) using genotypes obtained from whole genome sequencing (WGS) of DNA and gene expression levels from RNA sequencing (RNA-seq) of whole blood in 2,622 participants in the Framingham Heart Study. We identified 6,778,286 cis-eQTL variant-gene transcript (eGene) pairs at P < 5×10⁻⁸ (2,855,111 unique cis-eQTL variants and 15,982 unique eGenes) and 1,469,754 trans-eQTL variant-eGene pairs at P < 1×10⁻¹² (526,056 unique trans-eQTL variants and 7,233 unique eGenes). In addition, 442,379 cis-eQTL variants were associated with expression of 1,518 long non-protein-coding RNAs (lncRNAs). Gene Ontology (GO) analyses revealed that the top GO terms for eGenes are enriched for immune functions (FDR < 0.05). The cis-eQTL variants are enriched for SNPs reported to be associated with 815 traits in prior GWAS, including cardiovascular disease risk factors. As proof of concept, we used this eQTL resource in conjunction with genetic variants from public GWAS databases in causal inference testing (e.g., COVID-19 severity). After Bonferroni correction, Mendelian randomization analyses identified putative causal associations of 60 eGenes with systolic blood pressure, 13 genes with coronary artery disease, and seven genes with COVID-19 severity. This study created a comprehensive eQTL resource that will be made available to the scientific community via BioData Catalyst. This will advance understanding of the genetic architecture of gene expression underlying a wide range of diseases.
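As an illustration of the Bonferroni-corrected Mendelian randomization step mentioned in the abstract, the following is a minimal sketch, not the authors' pipeline. It assumes a single-instrument Wald ratio estimator (the abstract does not state which estimator was used), and the function names, effect sizes, and number of tests are hypothetical values chosen only for the example.

```python
import math


def wald_ratio(beta_eqtl, beta_gwas, se_gwas):
    """Single-instrument MR estimate of the effect of expression on a trait.

    Assumption: Wald ratio with a first-order standard error that ignores
    uncertainty in the eQTL effect. The source does not specify its estimator.
    """
    estimate = beta_gwas / beta_eqtl
    se = abs(se_gwas / beta_eqtl)
    z = estimate / se
    # Two-sided p-value under a standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical inputs: eQTL effect on expression, GWAS effect on the trait.
estimate, se, p_value = wald_ratio(beta_eqtl=0.5, beta_gwas=0.1, se_gwas=0.02)
threshold = bonferroni_threshold(alpha=0.05, n_tests=1000)  # hypothetical test count
is_significant = p_value < threshold
```

In this sketch an eGene would be reported as putatively causal only if `is_significant` is true, that is, if its p-value falls below 0.05 divided by the number of genes tested.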