Wen S H, Hsiao C K
Department of Public Health, College of Medicine, Tzu-Chi University, Hualien, 97004, Taiwan.
Department of Public Health and Institute of Epidemiology, College of Public Health, National Taiwan University, Taipei, 100, Taiwan.
J Hum Genet. 2007;52(8):650-658. doi: 10.1007/s10038-007-0159-9. Epub 2007 Jun 30.
Multiple testing arises routinely in genome-wide association studies with dense SNP maps. With numerous SNPs, not only do genotyping cost and time increase dramatically, but many methods that control the family-wise error rate (FWER) also become too conservative and lose power to detect disease-associated SNPs. Recently, several powerful two-stage strategies for multiple testing have received considerable attention. In this paper, we propose a grid-search algorithm for the optimal allocation of sample size in these two-stage procedures. Two types of constraint are considered: a fixed overall cost and a limited total sample size. With the proposed optimal allocation, the number of false positives can be kept tolerable and power increased while meeting these constraints. The simulations indicate that, as a general rule, allocating at least 80% of the total cost to stage one provides maximum power, as opposed to other methods. If the per-genotype cost in stage two differs from that in stage one, allocating a smaller proportion of the total cost to stage one still maintains good power. For a limited total sample size, evaluating all markers on 55% of the subjects in the first stage provides the maximum power, while reducing cost by approximately 43%.
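The abstract does not spell out the grid-search procedure, so the following is only a minimal sketch of how a search over stage-one cost fractions might look under a fixed total cost. Everything in it is an illustrative assumption rather than the authors' method: the one-sided z-test power formula, the replication-style analysis in which a marker must pass both stages, the Bonferroni-style stage-two threshold, and all names (`power_one_sided`, `grid_search`, `effect`, `c1`, `c2`, the `alpha1` grid).

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, used for the assumed z-test


def power_one_sided(n, effect, alpha):
    """Power of a one-sided z-test with n subjects (assumed test, not from the paper)."""
    if n <= 0:
        return 0.0
    return 1.0 - Z.cdf(Z.inv_cdf(1.0 - alpha) - effect * n ** 0.5)


def grid_search(total_cost, n_markers, effect, fwer=0.05, c1=1.0, c2=1.0, steps=19):
    """Search stage-one cost fractions and stage-one thresholds for maximum power.

    Assumptions (hypothetical): stage one genotypes all markers on n1 subjects
    at per-genotype cost c1; roughly alpha1 * n_markers markers move on to
    stage two, where n2 new subjects are genotyped at per-genotype cost c2.
    """
    fractions = [(i + 1) / (steps + 1) for i in range(steps)]  # 0.05 ... 0.95
    alpha1_grid = [0.001, 0.005, 0.01, 0.05, 0.1]
    best = None
    for f in fractions:
        for alpha1 in alpha1_grid:
            # Subjects affordable in stage one with fraction f of the budget.
            n1 = int(f * total_cost / (c1 * n_markers))
            # Expected number of markers carried into stage two.
            m2 = max(1.0, alpha1 * n_markers)
            # Subjects affordable in stage two with the remaining budget.
            n2 = int((1.0 - f) * total_cost / (c2 * m2))
            # Bonferroni-style stage-two threshold over the carried markers.
            alpha2 = fwer / m2
            # A true marker must pass both (independent) stages.
            power = power_one_sided(n1, effect, alpha1) * power_one_sided(n2, effect, alpha2)
            if best is None or power > best["power"]:
                best = {"power": power, "fraction_stage1": f,
                        "alpha1": alpha1, "n1": n1, "n2": n2}
    return best


if __name__ == "__main__":
    print(grid_search(total_cost=2_000_000, n_markers=10_000, effect=0.2))
```

Because the power model here is a simplification, the fraction it selects should not be expected to reproduce the 80% or 55% figures reported in the abstract; the sketch only shows the shape of a constrained search over allocations.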