A Unifying Framework for Imputing Summary Statistics in Genome-Wide Association Studies.

Author information

Wu Yue, Eskin Eleazar, Sankararaman Sriram

Affiliations

Department of Computer Science, University of California, Los Angeles, Los Angeles.

Department of Human Genetics, University of California, Los Angeles, Los Angeles.

Publication information

J Comput Biol. 2020 Mar;27(3):418-428. doi: 10.1089/cmb.2019.0449. Epub 2020 Feb 13.

Abstract

Methods to impute missing data are routinely used to increase power in genome-wide association studies. There are two broad classes of imputation methods. The first class imputes genotypes at the untyped variants, given those at the typed variants, and then performs a statistical test of association at the imputed variants. The second class, summary statistic imputation (SSI), directly imputes association statistics at the untyped variants, given the association statistics observed at the typed variants. The second class is appealing as it tends to be computationally efficient while only requiring the summary statistics from a study, while the former class requires access to individual-level data that can be difficult to obtain. The statistical properties of these two classes of imputation methods have not been fully understood. In this study, we show that the two classes of imputation methods yield association statistics with similar distributions for sufficiently large sample sizes. Using this relationship, we can understand the effect of the imputation method on power. We show that a commonly used approach to SSI that we term SSI with variance reweighting generally leads to a loss in power. On the contrary, our proposed method for SSI that does not perform variance reweighting fully accounts for imputation uncertainty, while achieving better power.
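
As a rough illustration of the second class of methods, the sketch below imputes a z-score at one untyped variant from the z-scores at typed variants. It assumes the formulation commonly used in the SSI literature, in which z-scores are modeled as multivariate normal with the linkage disequilibrium (LD) matrix as their correlation, so that the imputed statistic is the conditional mean. It returns the statistic both without and with variance reweighting. The function names, argument names, and example numbers are hypothetical and are not taken from the paper.

```python
from math import sqrt


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def impute_z(ld_tt, ld_ut, z_typed):
    """Impute the z-score at one untyped variant (illustrative sketch).

    ld_tt   : LD (correlation) matrix among typed variants, assumed invertible
    ld_ut   : LD between the untyped variant and each typed variant
    z_typed : observed z-scores at the typed variants

    Returns (z_plain, z_reweighted):
      z_plain      = ld_ut * inv(ld_tt) * z_typed   (no variance reweighting)
      z_reweighted = z_plain / sqrt(ld_ut * inv(ld_tt) * ld_ut)
    """
    # ld_tt is symmetric, so solving ld_tt w = ld_ut gives the imputation weights.
    w = solve(ld_tt, ld_ut)
    z_plain = sum(wi * zi for wi, zi in zip(w, z_typed))
    # Variance of z_plain under the null, as implied by the assumed normal model.
    var = sum(wi * ri for wi, ri in zip(w, ld_ut))
    return z_plain, z_plain / sqrt(var)


# Hypothetical example: two typed variants, one untyped variant.
ld_tt = [[1.0, 0.5], [0.5, 1.0]]
ld_ut = [0.6, 0.6]
z_plain, z_reweighted = impute_z(ld_tt, ld_ut, [3.0, 2.0])
```

The two returned values differ only by the scaling factor, which is at most 1 when the LD matrices are consistent, so the reweighted statistic is never smaller in magnitude. How each statistic should be calibrated and how that affects power is the subject of the paper and is not addressed by this sketch.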
