Han Buhm, Duong Dat, Sul Jae Hoon, de Bakker Paul I W, Eskin Eleazar, Raychaudhuri Soumya
Department of Convergence Medicine, University of Ulsan College of Medicine & Asan Institute for Life Sciences, Asan Medical Center, Seoul 138-736, Republic of Korea,
Computer Science Department.
Hum Mol Genet. 2016 May 1;25(9):1857-66. doi: 10.1093/hmg/ddw049. Epub 2016 Feb 21.
Meta-analysis strategies have become critical to augment power of genome-wide association studies (GWAS). To reduce genotyping or sequencing cost, many studies today utilize shared controls, and these individuals can inadvertently overlap among multiple studies. If these overlapping individuals are not taken into account in meta-analysis, they can induce spurious associations. In this article, we propose a general framework for adjusting association statistics to account for overlapping subjects within a meta-analysis. The key idea of our method is to transform the covariance structure of the data, so it can be used in downstream analyses. As a result, the strategy is very flexible and allows a wide range of meta-analysis methods, such as the random effects model, to account for overlapping subjects. Using simulations and real datasets, we demonstrate that our method has utility in meta-analyses of GWAS, as well as in a multi-tissue mouse expression quantitative trait loci (eQTL) study where our method increases the number of discovered eQTL by up to 19% compared with existing methods.
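The idea of transforming the covariance structure so that standard downstream methods still apply can be illustrated with a minimal sketch. The abstract does not give the authors' formulas, so everything here is an assumption for illustration: the function names, the effect sizes, the standard errors, and the between-study correlation of 0.3 attributed to shared controls are all hypothetical. The sketch takes per-study effect estimates with a non-diagonal covariance matrix, computes the generalized least squares (GLS) fixed-effects estimate, and then builds "decoupled" estimates and variances such that an ordinary inverse-variance meta-analysis, which assumes independent studies, returns the same result.

```python
import numpy as np


def gls_fixed_effect(beta, cov):
    """Fixed-effects estimate that uses the full between-study covariance."""
    cov_inv = np.linalg.inv(cov)
    ones = np.ones(len(beta))
    var = 1.0 / (ones @ cov_inv @ ones)
    est = var * (ones @ cov_inv @ beta)
    return est, var


def decouple(beta, cov):
    """Hypothetical decoupling step (an illustrative assumption).

    Returns adjusted estimates and variances that can be treated as if the
    studies were independent, while preserving the GLS fixed-effects result.
    """
    cov_inv = np.linalg.inv(cov)
    weights = cov_inv.sum(axis=0)  # column sums of the inverse covariance
    if np.any(weights <= 0):
        raise ValueError("non-positive weight; overlap correlation too strong")
    dec_var = 1.0 / weights
    dec_beta = dec_var * (cov_inv @ beta)
    return dec_beta, dec_var


def inverse_variance_fixed_effect(beta, var):
    """Standard inverse-variance meta-analysis assuming independent studies."""
    w = 1.0 / var
    return (w * beta).sum() / w.sum(), 1.0 / w.sum()


# Hypothetical inputs: three studies; studies 0 and 1 share controls.
beta = np.array([0.10, 0.12, 0.05])
se = np.array([0.04, 0.05, 0.03])
corr = np.eye(3)
corr[0, 1] = corr[1, 0] = 0.3  # assumed correlation induced by overlap
cov = np.outer(se, se) * corr

est_gls, var_gls = gls_fixed_effect(beta, cov)
dec_beta, dec_var = decouple(beta, cov)
est_dec, var_dec = inverse_variance_fixed_effect(dec_beta, dec_var)

# Ignoring the overlap gives a variance that is too small (overconfident).
est_naive, var_naive = inverse_variance_fixed_effect(beta, se ** 2)
```

Because the decoupled variances form a diagonal covariance, they could in principle be passed to any method that expects independent studies, such as a random effects model. Whether this matches the transformation used in the article is not something the abstract confirms.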