

CoMM-S2: a collaborative mixed model using summary statistics in transcriptome-wide association studies.

Author affiliations

School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China.

Centre for Quantitative Medicine, Program in Health Services & Systems Research, Duke-NUS Medical School, 169857, Singapore.

Publication information

Bioinformatics. 2020 Apr 1;36(7):2009-2016. doi: 10.1093/bioinformatics/btz880.

Abstract

MOTIVATION

Although genome-wide association studies (GWAS) have deepened our understanding of the genetic architecture of complex traits, the mechanistic links that underlie how genetic variants cause complex traits remain elusive. To advance our understanding of these links, various consortia have collected a vast volume of genomic data that enable us to investigate the role that genetic variants play in gene expression regulation. Recently, a collaborative mixed model (CoMM) was proposed to interrogate the genome's role in complex traits by integrating both a GWAS dataset and an expression quantitative trait loci (eQTL) dataset. Although CoMM is a powerful approach that leverages regulatory information while accounting for the uncertainty in using an eQTL dataset, it requires individual-level GWAS data and cannot make full use of widely available GWAS summary statistics. Therefore, statistically efficient methods that leverage transcriptome information using only GWAS summary statistics are required.

RESULTS

In this study, we propose a novel probabilistic model, CoMM-S2, to examine the mechanistic role that genetic variants play by using only GWAS summary statistics instead of individual-level GWAS data. Like CoMM, which uses individual-level GWAS data, CoMM-S2 combines two models: the first model examines the relationship between gene expression and genotype, while the second model examines the relationship between the phenotype and the gene expression predicted by the first model. Unlike CoMM, CoMM-S2 requires only GWAS summary statistics. Using both simulation studies and real data analysis, we demonstrate that even though CoMM-S2 uses only GWAS summary statistics, its performance is comparable to that of CoMM, which uses individual-level GWAS data.
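The two-model structure described above can be illustrated with a small simulation. The sketch below is not CoMM-S2 itself, whose joint probabilistic model is not specified in this abstract; it is the simpler two-stage summary-statistic test (the gene-level score w'z / sqrt(w'Rw)) that such methods build on. All function names, the ridge penalty, the sample sizes, and the assumption of independent, standardized SNPs are hypothetical choices made for illustration only.

```python
import numpy as np


def standardize(a):
    """Center each column (or a vector) and scale it to unit variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0)


def gwas_z_scores(X, y):
    """Marginal per-SNP z-scores, i.e. the GWAS summary statistics."""
    n = X.shape[0]
    r = standardize(X).T @ standardize(y) / n  # per-SNP correlation with y
    return r * np.sqrt(n)                      # large-sample approximation


def ridge_weights(W, expr, lam=10.0):
    """Model 1: ridge regression of gene expression on eQTL genotypes."""
    Ws = standardize(W)
    p = Ws.shape[1]
    return np.linalg.solve(Ws.T @ Ws + lam * np.eye(p), Ws.T @ (expr - expr.mean()))


def gene_level_z(w, z, R):
    """Model 2 from summary statistics: association of predicted expression."""
    return float(w @ z / np.sqrt(w @ R @ w))


def run(alpha, seed=0, n_eqtl=300, n_gwas=2000, n_snps=20):
    """Simulate both datasets and return the gene-level z-score.

    alpha is the (assumed) effect of genetically regulated expression
    on the phenotype; alpha = 0 corresponds to the null hypothesis.
    """
    rng = np.random.default_rng(seed)
    gamma = rng.normal(0.0, 0.3, n_snps)        # SNP effects on expression
    W = rng.standard_normal((n_eqtl, n_snps))   # eQTL genotypes
    X = rng.standard_normal((n_gwas, n_snps))   # GWAS genotypes
    expr = W @ gamma + rng.standard_normal(n_eqtl)
    pheno = alpha * (X @ gamma) + rng.standard_normal(n_gwas)

    z = gwas_z_scores(X, pheno)                 # only these leave the GWAS
    w = ridge_weights(W, expr)                  # estimated from eQTL data
    R = np.corrcoef(W, rowvar=False)            # LD estimated from eQTL panel
    return gene_level_z(w, z, R)


if __name__ == "__main__":
    print("gene-level z, alpha = 0.3:", run(0.3))
    print("gene-level z, alpha = 0.0:", run(0.0))
```

Note that the second stage touches the GWAS only through the z-scores and a correlation (LD) matrix, which is what makes a summary-statistic analysis possible. Unlike this two-stage sketch, CoMM is described above as also accounting for the uncertainty in the eQTL-derived weights.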

AVAILABILITY AND IMPLEMENTATION

The implementation of CoMM-S2 is included in the CoMM package, which can be downloaded from https://github.com/gordonliu810822/CoMM.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

