Ziegler Andreas, König Inke R, Thompson John R
Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany.
Biom J. 2008 Feb;50(1):8-28. doi: 10.1002/bimj.200710398.
To search the entire human genome for association is a novel and promising approach to unravelling the genetic basis of complex genetic diseases. In these genome-wide association studies (GWAs), several hundreds of thousands of single nucleotide polymorphisms (SNPs) are analyzed at the same time, posing substantial biostatistical and computational challenges. In this paper, we discuss a number of biostatistical aspects of GWAs in detail. We specifically consider quality control issues and show that signal intensity plots are a conditio sine qua non in today's GWAs. Approaches to detect and adjust for population stratification are briefly examined. We discuss different strategies aimed at tackling the problem of multiple testing, including adjustment of p-values, the false positive report probability and the false discovery rate. Another aspect of GWAs requiring special attention is the search for gene-gene and gene-environment interactions. We finally describe multistage approaches to GWAs.
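The three multiple-testing strategies named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the example p-values, and the prior and power values are assumptions chosen for illustration. The sketch uses the textbook Bonferroni threshold, the Benjamini-Hochberg step-up procedure for the false discovery rate, and the false positive report probability in its usual form, FPRP = α(1 − π) / (α(1 − π) + (1 − β)π), where π is the assumed prior probability of a true association and 1 − β the power.

```python
def bonferroni_threshold(alpha, m):
    """Per-test significance threshold for m tests at family-wise level alpha."""
    return alpha / m


def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level q.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Remember the largest rank k with p_(k) <= (k / m) * q.
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def fprp(alpha, power, prior):
    """False positive report probability for a finding significant at level alpha.

    prior is the assumed probability that the tested association is real.
    """
    false_pos = alpha * (1 - prior)
    true_pos = power * prior
    return false_pos / (false_pos + true_pos)


if __name__ == "__main__":
    # Hypothetical p-values for eight SNPs (illustrative only).
    p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(bonferroni_threshold(0.05, 500_000))
    print(benjamini_hochberg(p_vals, q=0.05))
    print(fprp(alpha=0.05, power=0.8, prior=0.001))
    print(fprp(alpha=1e-7, power=0.8, prior=0.001))
```

Under these assumed inputs, a nominal 5% level combined with a prior of 1 in 1000 gives an FPRP close to 1, whereas a Bonferroni-type threshold for 500,000 SNPs brings it well below 1%. This is one way to see why unadjusted p-values are of little use at genome-wide scale.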