Department of Medical Genetics, Cambridge Institute for Medical Research, Cambridge University, Cambridge CB2 0XY, UK.
Biostatistics. 2010 Oct;11(4):661-73. doi: 10.1093/biostatistics/kxq035. Epub 2010 Jun 3.
Homer and others (2008. Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genetics 4, e1000167) recently showed that, given allele frequency data for a large number of single nucleotide polymorphisms (SNPs) in a sample, together with corresponding population "reference" frequencies, one can infer whether an individual was a member of the sample by typing that individual's DNA at the same set of loci. This observation has prompted the precautionary removal of large amounts of summary data from public access. That study and subsequent work on the problem have followed a frequentist approach. This paper sets out a Bayesian analysis of the problem, which clarifies the role of the reference frequencies and allows prior probabilities of the individual's membership in the sample to be incorporated.
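The way a prior probability of membership enters a Bayesian analysis can be illustrated with the generic odds form of Bayes' theorem: posterior odds equal prior odds multiplied by the Bayes factor. The sketch below is not the model developed in the paper. It assumes that a natural-log Bayes factor comparing "member" against "non-member" has already been computed from the genotype and frequency data by some method, and the function name and parameters are hypothetical.

```python
import math


def posterior_membership_probability(prior: float, log_bayes_factor: float) -> float:
    """Combine a prior membership probability with a log Bayes factor.

    Hypothetical helper for illustration only: `log_bayes_factor` is assumed
    to be the natural log of P(data | member) / P(data | non-member),
    obtained elsewhere.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")

    # Work on the log-odds scale: log posterior odds = log prior odds + log BF.
    log_prior_odds = math.log(prior) - math.log1p(-prior)
    log_posterior_odds = log_prior_odds + log_bayes_factor

    # Convert log-odds back to a probability with a numerically stable logistic.
    if log_posterior_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_posterior_odds))
    odds = math.exp(log_posterior_odds)
    return odds / (1.0 + odds)
```

With a prior of 1 in 1000, for example, the data must supply a Bayes factor of about 999 before membership becomes as likely as not, which is why the choice of prior matters when interpreting evidence of this kind.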