de Bakker Paul I W, Yelensky Roman, Pe'er Itsik, Gabriel Stacey B, Daly Mark J, Altshuler David
Center for Human Genetic Research, Massachusetts General Hospital, 185 Cambridge Street, CPZN-6818, Boston, Massachusetts 02114-2790, USA.
Nat Genet. 2005 Nov;37(11):1217-23. doi: 10.1038/ng1669. Epub 2005 Oct 23.
We investigated selection and analysis of tag SNPs for genome-wide association studies by specifically examining the relationship between investment in genotyping and statistical power. Do pairwise or multimarker methods maximize efficiency and power? To what extent is power compromised when tags are selected from an incomplete resource such as HapMap? We addressed these questions using genotype data from the HapMap ENCODE project, association studies simulated under a realistic disease model, and empirical correction for multiple hypothesis testing. We demonstrate a haplotype-based tagging method that uniformly outperforms single-marker tests and methods for prioritization that markedly increase tagging efficiency. Examining all observed haplotypes for association, rather than just those that are proxies for known SNPs, increases power to detect rare causal alleles, at the cost of reduced power to detect common causal alleles. Power is robust to the completeness of the reference panel from which tags are selected. These findings have implications for prioritizing tag SNPs and interpreting association studies.
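The abstract contrasts pairwise and multimarker tagging without spelling out either procedure. As a rough illustration of the pairwise idea only, the sketch below greedily picks tag SNPs so that every SNP in a reference panel is either a tag or correlated with one at r² at or above a threshold. The 0.8 threshold, the function names, the toy haplotypes, and the greedy strategy itself are assumptions made for illustration; this is not presented as the authors' method.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared correlation (r^2) between two biallelic SNPs.

    a and b are equal-length lists of 0/1 alleles, one entry per phased haplotype.
    """
    n = len(a)
    pa = sum(a) / n                      # frequency of allele 1 at the first SNP
    pb = sum(b) / n                      # frequency of allele 1 at the second SNP
    pab = sum(x & y for x, y in zip(a, b)) / n  # frequency of the 1-1 haplotype
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0                       # a monomorphic SNP carries no LD information
    d = pab - pa * pb                    # linkage disequilibrium coefficient D
    return d * d / denom


def greedy_pairwise_tags(haplotypes, threshold=0.8):
    """Pick tags greedily until every SNP is captured at r^2 >= threshold.

    haplotypes: dict mapping a SNP name to its list of 0/1 alleles.
    The threshold default is an illustrative assumption.
    """
    snps = list(haplotypes)
    # Each SNP always captures itself, so the loop below is guaranteed to finish.
    captures = {s: {s} for s in snps}
    for s, t in combinations(snps, 2):
        if r_squared(haplotypes[s], haplotypes[t]) >= threshold:
            captures[s].add(t)
            captures[t].add(s)

    untagged = set(snps)
    tags = []
    while untagged:
        # Choose the SNP that captures the most still-untagged SNPs.
        best = max(snps, key=lambda s: len(captures[s] & untagged))
        tags.append(best)
        untagged -= captures[best]
    return tags


# Toy panel of 8 phased haplotypes (hypothetical data).
panel = {
    "snp1": [0, 0, 0, 0, 1, 1, 1, 1],
    "snp2": [0, 0, 0, 0, 1, 1, 1, 1],  # in perfect LD with snp1
    "snp3": [0, 0, 0, 1, 1, 1, 1, 1],
    "snp4": [0, 1, 0, 1, 0, 1, 0, 1],
}
tags = greedy_pairwise_tags(panel)
print(tags)
```

In this toy panel snp1 also captures snp2, so three tags cover four SNPs. A multimarker approach would additionally test haplotypes formed from several tags as proxies for untyped SNPs, which this sketch does not attempt.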