Department of Mathematical Sciences, Michigan Technological University, Houghton, MI 49931, USA.
Eur J Hum Genet. 2010 May;18(5):596-603. doi: 10.1038/ejhg.2009.201. Epub 2009 Nov 25.
Recently, Steen et al proposed a novel two-stage approach for family-based genome-wide association studies. In the first stage, a test based on between-family information is used to rank SNPs according to their P-values or the conditional power of the test. In the second stage, the R most promising SNPs are tested using a family-based association test. We call this two-stage approach the top R method. Ionita-Laza et al proposed an exponential weighting method within a two-stage framework. In its second stage, instead of testing only the top R SNPs, this approach tests all SNPs and weights the P-values of the association test according to the information from the first stage. However, both the top R and the exponential weighting methods use the first-stage information only to rank SNPs, which suggests that neither method uses that information efficiently. Furthermore, it may be unreasonable for the exponential weighting method to assign the same weight to all SNPs within a group when only one or a few SNPs are associated with a disease. In this article, we propose a data-driven weighting scheme within a two-stage framework. In this method, we use the information from the first stage to determine a SNP-specific weight for each SNP. We use simulation studies to evaluate the performance of our method. The simulation results show that our proposed method is consistently more powerful than the top R method and the exponential weighting method, regardless of the LD structure, population structure, and family structure.
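The contrast between testing only the top R SNPs and weighting all SNPs can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `top_r_test` and `weighted_bonferroni`, the example P-values, and the example weights are all hypothetical, and the abstract does not specify how the data-driven weights are computed. The sketch assumes a plain Bonferroni correction over the R selected SNPs for the top R method, and a standard weighted Bonferroni rule (weights normalised to average 1) as a stand-in for any scheme that assigns SNP-specific weights from first-stage information.

```python
import numpy as np

def top_r_test(p_stage1, p_stage2, r, alpha=0.05):
    """Top R sketch: keep the r SNPs with the smallest stage-1 P-values,
    then test only those in stage 2 with a Bonferroni threshold alpha / r."""
    p1 = np.asarray(p_stage1, dtype=float)
    p2 = np.asarray(p_stage2, dtype=float)
    top = np.argsort(p1, kind="stable")[:r]   # indices of the r best-ranked SNPs
    rejected = np.zeros(p1.size, dtype=bool)
    rejected[top] = p2[top] <= alpha / r
    return rejected

def weighted_bonferroni(p_stage2, weights, alpha=0.05):
    """Weighted Bonferroni sketch: every SNP is tested in stage 2, with a
    SNP-specific threshold alpha * w_i / m, where the weights average to 1.
    How the weights are derived from stage 1 is left open (assumption)."""
    p2 = np.asarray(p_stage2, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.mean()                          # normalise so that mean(w) == 1
    return p2 <= alpha * w / p2.size

# Hypothetical P-values for four SNPs (illustration only, not real data).
p_stage1 = [0.001, 0.5, 0.02, 0.9]    # between-family screening test
p_stage2 = [0.02, 0.0001, 0.03, 0.2]  # family-based association test
weights = [3.0, 1.0, 3.0, 1.0]        # hypothetical SNP-specific weights

top_r_result = top_r_test(p_stage1, p_stage2, r=2)
weighted_result = weighted_bonferroni(p_stage2, weights)
```

With these made-up numbers, the top R rule never tests the second SNP because it ranked poorly in the first stage, whereas the weighted rule still tests it at a stricter threshold. This is the practical difference between discarding and down-weighting SNPs.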