Chen Wei-Min, Manichaikul Ani, Rich Stephen S
Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA.
Am J Hum Genet. 2009 Sep;85(3):364-76. doi: 10.1016/j.ajhg.2009.08.003.
Recent advances in genotyping technology make large-scale association analysis feasible for disease-gene mapping. Powerful and robust family-based association methods are crucial for successful gene mapping. We propose a family-based association method, the generalized disequilibrium test (GDT), in which the genotype differences of all discordant relative pairs are used to assess association within a family. The GDT improves on existing methods in three ways: (1) information beyond first-degree relatives is incorporated efficiently, yielding substantial gains in power over existing tests; (2) the GDT statistic is implemented via a robust technique that does not rely on large-sample theory, resulting in further power gains, especially at stringent significance levels; and (3) covariates and weights based on family size are incorporated. The advantages of the GDT over existing methods are demonstrated by extensive computer simulations and by application to recently published large-scale genome-wide linkage data from the Type 1 Diabetes Genetics Consortium (T1DGC). In our simulations, the GDT consistently outperforms other tests for a common disease and frequently outperforms other tests for a rare disease; the power improvement is >13% in 6 of 8 extended-pedigree scenarios. All six of the strongest associations identified by the GDT have been reported by other studies, whereas only three or four of these associations can be identified by existing methods. For the type 1 diabetes (T1D) association at the gene UBASH3A, the GDT reached genome-wide significance (p = 4.3 × 10⁻⁶), a much stronger result than the published significance (p = 10⁻⁴).
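The core idea of the abstract, pooling genotype differences across all discordant relative pairs within each family, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes genotypes coded as minor-allele counts (0, 1, 2), treats every affected/unaffected pair in a family equally, and combines the per-family sums with a self-normalized ratio; that normalization is an assumption made for illustration and is not stated in the abstract. Covariate adjustment, family-size weights, and the calculation of significance are omitted. The function names `family_score` and `discordant_pair_statistic` are hypothetical.

```python
import math
from itertools import product


def family_score(genotypes, affected):
    """Sum genotype differences (affected minus unaffected) over all
    discordant pairs in one family.

    genotypes: list of allele counts (0, 1, or 2), one per family member.
    affected:  list of booleans, True if the member is affected.
    """
    cases = [g for g, a in zip(genotypes, affected) if a]
    controls = [g for g, a in zip(genotypes, affected) if not a]
    # Every affected member is paired with every unaffected member.
    return sum(g_case - g_ctrl for g_case, g_ctrl in product(cases, controls))


def discordant_pair_statistic(families):
    """Combine per-family scores into one statistic.

    Illustrative assumption: sum of family scores divided by the square
    root of the sum of squared family scores. Families with no discordant
    pairs contribute a score of zero.
    """
    scores = [family_score(g, a) for g, a in families]
    denom = math.sqrt(sum(s * s for s in scores))
    if denom == 0:
        return 0.0  # no informative families
    return sum(scores) / denom


# Toy data (made up for illustration): three small families.
toy_families = [
    ([2, 1, 0], [True, False, False]),
    ([1, 1], [True, False]),
    ([2, 0], [True, False]),
]
stat = discordant_pair_statistic(toy_families)
print(f"statistic = {stat:.4f}")
```

A positive value indicates that affected relatives tend to carry more copies of the counted allele than their unaffected relatives. Because each score is a within-family contrast, a pair need not be first-degree relatives to contribute, which reflects the abstract's point about using information beyond first-degree relatives.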