Jacobs Kevin B, Yeager Meredith, Wacholder Sholom, Craig David, Kraft Peter, Hunter David J, Paschal Justin, Manolio Teri A, Tucker Margaret, Hoover Robert N, Thomas Gilles D, Chanock Stephen J, Chatterjee Nilanjan
Nat Genet. 2009 Nov;41(11):1253-7. doi: 10.1038/ng.455. Epub 2009 Oct 4.
Aggregate results from genome-wide association studies (GWAS), such as genotype frequencies for cases and controls, were until recently often made available on public websites because they were thought to disclose negligible information concerning an individual's participation in a study. Homer et al. recently suggested that a method for forensic detection of an individual's contribution to an admixed DNA sample could be applied to aggregate GWAS data. Using a likelihood-based statistical framework, we developed an improved statistic that uses genotype frequencies and individual genotypes to infer whether a specific individual or any close relatives participated in the GWAS and, if so, what the participant's phenotype status is. Our statistic compares the logarithm of genotype frequencies, in contrast to that of Homer et al., which is based on differences in either SNP probe intensity or allele frequencies. We derive the theoretical power of our test statistics and explore the empirical performance in scenarios with varying numbers of randomly chosen or top-associated SNPs.
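The idea of comparing the logarithm of genotype frequencies can be illustrated with a minimal sketch. The abstract does not give the exact form of the statistic, so the code below is an assumption rather than the authors' method: it sums, over SNPs, the log of the study-group frequency of the individual's genotype minus the log of the corresponding frequency in a reference population. The function name `log_genotype_ratio`, the 0/1/2 genotype coding, and the `floor` parameter that guards against zero frequencies are all hypothetical choices made for illustration.

```python
import math


def log_genotype_ratio(genotypes, study_freqs, reference_freqs, floor=1e-6):
    """Illustrative log-frequency score (an assumed form, not the published statistic).

    genotypes       -- one genotype per SNP, coded 0, 1, or 2
    study_freqs     -- per SNP, a 3-tuple of genotype frequencies in the study group
    reference_freqs -- per SNP, a 3-tuple of genotype frequencies in a reference group
    floor           -- assumed lower bound that keeps log() defined for zero frequencies
    """
    total = 0.0
    for g, study, ref in zip(genotypes, study_freqs, reference_freqs):
        p_study = max(study[g], floor)
        p_ref = max(ref[g], floor)
        # Positive terms mean the genotype is more common in the study group
        # than in the reference group at this SNP.
        total += math.log(p_study) - math.log(p_ref)
    return total


# Toy, made-up frequencies for three SNPs (not data from the paper).
genotypes = [0, 1, 2]
study = [(0.5, 0.4, 0.1), (0.3, 0.5, 0.2), (0.6, 0.3, 0.1)]
reference = [(0.4, 0.4, 0.2), (0.3, 0.5, 0.2), (0.7, 0.2, 0.1)]
score = log_genotype_ratio(genotypes, study, reference)
```

Under this assumed form, a score well above zero across many SNPs would suggest the individual's genotypes resemble the study group more than the reference group. Deciding what counts as "well above zero" requires the null distribution of the statistic, which this sketch does not attempt to derive.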