Dickhaus Thorsten
Stat Appl Genet Mol Biol. 2015 Aug;14(4):347-60. doi: 10.1515/sagmb-2014-0052.
Genetic association studies lead to simultaneous categorical data analysis. The sample for every genetic locus consists of a contingency table containing the numbers of observed genotype-phenotype combinations. Under a case-control design, the row counts of every table are identical and fixed, while the column counts are random. The aim of the statistical analysis is to test independence of the phenotype and the genotype at every locus. We present an objective Bayesian methodology for these association tests, which relies on the conjugacy of Dirichlet and multinomial distributions. Being based on the likelihood principle, the Bayesian tests avoid looping over all tables with given marginals. Making use of data generated by the Wellcome Trust Case Control Consortium (WTCCC), we illustrate that the ordering of the Bayes factors shows good agreement with that of frequentist p-values. Furthermore, we deal with specifying prior probabilities for the validity of the null hypotheses by taking linkage disequilibrium structure into account and exploiting the concept of effective numbers of tests. Application of a Bayesian decision theoretic multiple test procedure to the WTCCC data illustrates the proposed methodology. Finally, we discuss two methods for reconciling frequentist and Bayesian approaches to the multiple association test problem.
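As an illustration of how Dirichlet-multinomial conjugacy yields a closed-form Bayes factor for a single locus, the following is a minimal sketch and not the paper's own procedure. It assumes a table with fixed row counts (phenotype groups in rows, genotypes in columns), a symmetric Dirichlet prior with a hypothetical concentration parameter `alpha` (the value 1.0 is an assumption, not taken from the abstract), a common genotype distribution across rows under the null hypothesis, and independent row-specific distributions under the alternative. The function names are hypothetical. The second function converts a Bayes factor and a prior null probability into a posterior null probability via Bayes' rule; how the prior is derived from linkage disequilibrium structure is not covered here.

```python
from math import exp, lgamma, log


def log_multivariate_beta(a):
    """Log of the multivariate beta function B(a) = prod Gamma(a_j) / Gamma(sum a_j)."""
    return sum(lgamma(x) for x in a) - lgamma(sum(a))


def log_bf01(table, alpha=1.0):
    """Log Bayes factor of H0 (independence) against H1 (association).

    table: list of rows, one per phenotype group, each a list of genotype counts.
    alpha: symmetric Dirichlet concentration (an assumed prior choice).
    The multinomial coefficients are identical under both hypotheses and cancel.
    """
    k = len(table[0])
    prior = [alpha] * k
    log_b_prior = log_multivariate_beta(prior)

    # H0: one genotype distribution shared by all rows, so counts pool by column.
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    log_m0 = log_multivariate_beta([alpha + c for c in col_totals]) - log_b_prior

    # H1: each row has its own genotype distribution with an independent prior.
    log_m1 = sum(
        log_multivariate_beta([alpha + n for n in row]) - log_b_prior
        for row in table
    )
    return log_m0 - log_m1


def posterior_null_probability(log_bf, prior_null):
    """Posterior probability of H0 from log BF01 and a prior P(H0) in (0, 1)."""
    log_odds = log(prior_null / (1.0 - prior_null)) + log_bf
    # Numerically stable logistic transform of the posterior log odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + exp(-log_odds))
    e = exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Made-up counts for illustration only: rows are cases and controls.
    balanced = [[50, 100, 50], [50, 100, 50]]
    skewed = [[90, 10, 0], [0, 10, 90]]
    for name, tab in (("balanced", balanced), ("skewed", skewed)):
        lbf = log_bf01(tab)
        print(name, lbf, posterior_null_probability(lbf, prior_null=0.5))
```

Because each marginal likelihood is a ratio of multivariate beta functions, no enumeration of tables with given marginals is needed. Working on the log scale avoids overflow in the gamma functions for the large counts typical of case-control data.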