Liu Chunyu, Joehanes Roby, Ma Jiantao, Wang Yuxuan, Sun Xianbang, Keshawarz Amena, Sooda Meera, Huan Tianxiao, Hwang Shih-Jen, Bui Helena, Tejada Brandon, Munson Peter J, Cumhur Demirkale, Heard-Costa Nancy L, Pitsillides Achilleas N, Peloso Gina M, Feolo Michael, Sharopova Nataliya, Vasan Ramachandran S, Levy Daniel
Boston University.
National Institutes of Health.
Res Sq. 2022 May 31:rs.3.rs-1598646. doi: 10.21203/rs.3.rs-1598646/v1.
To create a scientific resource of expression quantitative trait loci (eQTL), we conducted a genome-wide association study (GWAS) using genotypes obtained from whole genome sequencing (WGS) of DNA and gene expression levels from RNA sequencing (RNA-seq) of whole blood in 2622 participants in the Framingham Heart Study. We identified 6,778,286 cis-eQTL variant-gene transcript (eGene) pairs at p < 5 × 10⁻⁸ (2,855,111 unique cis-eQTL variants and 15,982 unique eGenes) and 1,469,754 trans-eQTL variant-eGene pairs at p < 1 × 10⁻¹² (526,056 unique trans-eQTL variants and 7,233 unique eGenes). In addition, 442,379 cis-eQTL variants were associated with expression of 1518 long non-protein coding RNAs (lncRNAs). Gene Ontology (GO) analyses revealed that the top GO terms for eGenes are enriched for immune functions (FDR < 0.05). The cis-eQTL variants are enriched for SNPs reported to be associated with 815 traits in prior GWAS, including cardiovascular disease risk factors. As proof of concept, we used this eQTL resource in conjunction with genetic variants from public GWAS databases in causal inference testing (e.g., COVID-19 severity). After Bonferroni correction, Mendelian randomization analyses identified putative causal associations of 60 eGenes with systolic blood pressure, 13 genes with coronary artery disease, and seven genes with COVID-19 severity. This study created a comprehensive eQTL resource via BioData Catalyst that will be made available to the scientific community. This will advance understanding of the genetic architecture of gene expression underlying a wide range of diseases.
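The abstract does not state which Mendelian randomization estimator was used, so the following is only a minimal sketch of the general idea, assuming a single-instrument Wald ratio with a first-order standard error and a Bonferroni threshold across the genes tested. The gene names and summary statistics are hypothetical and are not taken from the study.

```python
import math


def wald_ratio(beta_eqtl, beta_gwas, se_gwas):
    """Wald ratio estimate for one instrument (assumed estimator, not the study's).

    beta_eqtl: effect of the variant on gene expression (eQTL effect)
    beta_gwas: effect of the same variant on the trait (GWAS effect)
    se_gwas:   standard error of the GWAS effect
    """
    estimate = beta_gwas / beta_eqtl
    # First-order approximation: ignores uncertainty in the eQTL effect.
    se = se_gwas / abs(beta_eqtl)
    z = estimate / se
    # Two-sided p-value under a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each gene whose p-value passes alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return {gene: p < threshold for gene, p in p_values.items()}


# Hypothetical summary statistics: (beta_eqtl, beta_gwas, se_gwas) per gene.
toy_stats = {
    "GENE_A": (0.50, 0.10, 0.02),
    "GENE_B": (0.40, 0.01, 0.02),
    "GENE_C": (-0.30, 0.06, 0.01),
}

p_values = {gene: wald_ratio(*stats)[2] for gene, stats in toy_stats.items()}
significant = bonferroni_significant(p_values)
print(significant)
```

With three genes tested, the Bonferroni threshold in this toy example is 0.05 / 3; in a real analysis the denominator would be the number of eGenes actually tested against each trait.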