Channing Laboratories, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA.
Am J Hum Genet. 2010 Apr 9;86(4):573-80. doi: 10.1016/j.ajhg.2010.02.019. Epub 2010 Mar 25.
Large numbers of control individuals with genome-wide genotype data are now available through various databases. These controls are regularly used in case-control genome-wide association studies (GWAS) to increase statistical power. Controls are often "unselected" for the disease of interest and are not matched to cases in terms of confounding factors, making the studies more vulnerable to confounding as a result of population stratification. In this communication, we demonstrate that family-based designs can integrate unselected controls from other studies into the analysis without compromising the robustness of family-based designs against genetic confounding. The result is a hybrid case-control family-based analysis that achieves higher power levels than population-based studies with the same number of cases and controls. This strategy is widely applicable and is well suited to any situation in which both family and case-control data are available. The approach consists of three steps. First, we perform a standard family-based association test that does not utilize the between-family component. Second, we use the between-family information in conjunction with the genotypes from unselected controls in a Cochran-Armitage trend test. The p values from this step are then calculated by rank ordering the individual Cochran-Armitage trend test statistics for the genotype markers. Third, we generate a combined p value from the association p values of the first two steps. Simulation studies are used to assess the achievable power levels of this method compared to standard analysis approaches. We illustrate the approach with an application to a GWAS of attention deficit hyperactivity disorder that uses parent-offspring trios and publicly available controls.
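The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the family-based test, the rank-to-p-value rule, or the combination rule, so a basic transmission disequilibrium test (TDT), the rule p = rank / (number of markers), and Fisher's method are used as stand-ins. Fisher's method also assumes the two p values are independent under the null hypothesis. All function names and input layouts are hypothetical.

```python
import math


def tdt_p_value(transmitted, untransmitted):
    """Step 1 (assumed stand-in): TDT p value from heterozygous-parent
    transmission counts, using a chi-square statistic with 1 df."""
    total = transmitted + untransmitted
    if total == 0:
        return 1.0
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def trend_statistic(family_counts, control_counts, weights=(0, 1, 2)):
    """Step 2: Cochran-Armitage trend chi-square for a 2 x 3 table of
    genotype counts (0, 1, 2 copies of the allele). `family_counts` stands
    for the between-family genotype counts (hypothetical input layout)."""
    r_total = sum(family_counts)
    s_total = sum(control_counts)
    n = r_total + s_total
    cols = [f + c for f, c in zip(family_counts, control_counts)]
    t = sum(w * (f * s_total - c * r_total)
            for w, f, c in zip(weights, family_counts, control_counts))
    sum_w2 = sum(w * w * col for w, col in zip(weights, cols))
    sum_w = sum(w * col for w, col in zip(weights, cols))
    var = (r_total * s_total / n) * (n * sum_w2 - sum_w ** 2)
    return 0.0 if var == 0 else t * t / var


def rank_p_values(statistics):
    """Step 2, continued (assumed rule): p = rank / M, where rank 1 is the
    largest statistic among the M markers. Ties are broken by input order."""
    m = len(statistics)
    order = sorted(range(m), key=lambda k: statistics[k], reverse=True)
    p_values = [0.0] * m
    for rank, k in enumerate(order, start=1):
        p_values[k] = rank / m
    return p_values


def fisher_combine(p1, p2):
    """Step 3 (assumed rule): Fisher's method for two independent p values.
    With two p values the chi-square (4 df) tail has a closed form."""
    prod = p1 * p2
    if prod == 0.0:
        return 0.0
    return prod * (1.0 - math.log(prod))
```

A usage example with made-up counts for three markers (illustrative numbers only, not data from the study):

```python
within = [tdt_p_value(60, 40), tdt_p_value(50, 50), tdt_p_value(45, 55)]
stats = [trend_statistic((30, 50, 20), (400, 450, 150)),
         trend_statistic((35, 48, 17), (380, 470, 150)),
         trend_statistic((40, 45, 15), (410, 440, 150))]
between = rank_p_values(stats)
combined = [fisher_combine(pw, pb) for pw, pb in zip(within, between)]
```

Because the step 2 p values come from ranks across markers rather than from the chi-square distribution, they reflect each marker's position relative to the others, not the absolute size of its trend statistic.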