CoMM-S4: A Collaborative Mixed Model Using Summary-Level eQTL and GWAS Datasets in Transcriptome-Wide Association Studies.

Author information

Yang Yi, Yeung Kar-Fu, Liu Jin

Affiliations

Centre for Quantitative Medicine, Program in Health Services and Systems Research, Duke-NUS Medical School, Singapore, Singapore.

Publication information

Front Genet. 2021 Sep 20;12:704538. doi: 10.3389/fgene.2021.704538. eCollection 2021.

Abstract

Genome-wide association studies (GWAS) have achieved remarkable success in identifying SNP-trait associations over the last decade. However, identifying the mechanisms that connect genetic variants with complex traits remains challenging, as the majority of GWAS associations lie in non-coding regions. Methods that integrate genomic and transcriptomic data allow us to investigate how genetic variants may affect a trait through their effect on gene expression. These include CoMM and CoMM-S2, likelihood-ratio-based methods that integrate GWAS and eQTL studies to assess expression-trait association. However, their reliance on individual-level eQTL data renders them inapplicable when only summary-level eQTL results, such as those from large-scale eQTL analyses, are available. We develop an efficient probabilistic model, CoMM-S4, to explore expression-trait association using summary-level eQTL and GWAS datasets. Compared with CoMM-S2, which uses individual-level eQTL data, CoMM-S4 requires only summary-level eQTL data. To test expression-trait association, we constructed an efficient variational Bayesian EM algorithm and a likelihood ratio test. We applied CoMM-S4 to both simulated and real data. The simulation results demonstrate that CoMM-S4 can perform as well as CoMM-S2 and S-PrediXcan, and analyses using GWAS summary statistics from Biobank Japan and eQTL summary statistics from eQTLGen and GTEx suggest novel susceptibility loci for cardiovascular diseases and osteoporosis. The developed R package is available at https://github.com/gordonliu810822/CoMM.
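The abstract names a likelihood ratio test but does not spell out its form. As a minimal sketch only, the snippet below shows how a generic likelihood ratio test turns two maximized log-likelihoods into a p-value. It assumes a single tested parameter and a chi-square reference distribution with 1 degree of freedom; the function name `lrt_pvalue` and the numeric inputs are hypothetical and are not taken from the CoMM package.

```python
import math


def lrt_pvalue(loglik_null: float, loglik_alt: float) -> float:
    """P-value of a generic 1-df likelihood ratio test (illustrative only)."""
    # LRT statistic: twice the gain in maximized log-likelihood.
    # Clamp at zero to guard against small negative values from numerical error.
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one gene, chosen for illustration.
    p = lrt_pvalue(loglik_null=-1520.4, loglik_alt=-1512.1)
    print(f"p-value: {p:.3g}")
```

In a transcriptome-wide analysis, a test of this kind would be repeated once per gene, so the resulting p-values would still need a multiple-testing correction.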


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3e6/8488198/5fee783ef06f/fgene-12-704538-g001.jpg
