Max Delbrück Center for Molecular Medicine Berlin-Buch, Berlin, Germany.
BMC Med Genet. 2012 Jan 27;13:8. doi: 10.1186/1471-2350-13-8.
Genome-wide association studies (GWAS) provide an increasing number of single nucleotide polymorphisms (SNPs) associated with diseases. Our aim is to exploit those closely spaced SNPs in candidate regions for a deeper analysis of association beyond single SNP analysis, combining the classical stepwise regression approach with haplotype analysis to identify risk haplotypes for complex diseases.
Our proposed multi-locus stepwise regression starts by evaluating all pairwise SNP combinations and then extends each SNP combination stepwise by one SNP from the region, carrying out haplotype regression at each step. The best associated haplotype patterns are kept for the next step, and the final results must be corrected for multiple testing. These haplotypes should also be replicated in an independent data set. We applied the method to a region of 259 SNPs from the epidermal differentiation complex (EDC) on chromosome 1q21 in a German GWAS, using a case-control set (1,914 individuals) for discovery and 268 families with at least two affected children for replication.
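The search strategy described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes phased haplotypes are already available as 0/1 tuples, and it replaces haplotype regression with a plain Pearson chi-square on a 2x2 table (one haplotype versus all others, cases versus controls). The function names, the `keep` parameter, and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def best_haplotype_score(snps, case_haps, control_haps):
    """Return (statistic, haplotype) for the most associated haplotype
    over the given SNP indices. Stand-in for haplotype regression."""
    cases = Counter(tuple(h[i] for i in snps) for h in case_haps)
    controls = Counter(tuple(h[i] for i in snps) for h in control_haps)
    n_cases, n_controls = len(case_haps), len(control_haps)
    best_stat, best_hap = 0.0, None
    for hap in set(cases) | set(controls):
        a, c = cases[hap], controls[hap]
        stat = chi2_2x2(a, n_cases - a, c, n_controls - c)
        if stat > best_stat:
            best_stat, best_hap = stat, hap
    return best_stat, best_hap


def stepwise_search(case_haps, control_haps, n_snps, max_size=4, keep=5):
    """Start from all SNP pairs, keep the `keep` best SNP sets per step,
    and extend each kept set by one further SNP from the region."""
    current = list(combinations(range(n_snps), 2))
    results, n_tests = {}, 0
    for size in range(2, max_size + 1):
        scored = []
        for snps in current:
            stat, hap = best_haplotype_score(snps, case_haps, control_haps)
            scored.append((stat, snps, hap))
        n_tests += len(scored)  # every evaluated set counts as a test
        scored.sort(key=lambda item: item[0], reverse=True)
        results[size] = scored[:keep]
        extended = set()
        for _, snps, _ in results[size]:
            for j in range(n_snps):
                if j not in snps:
                    extended.add(tuple(sorted(snps + (j,))))
        current = sorted(extended)
    return results, n_tests


# Toy data (hypothetical): haplotype (1, 0, 1, 0) is enriched in cases.
case_haps = ([(1, 0, 1, 0)] * 30 + [(0, 0, 0, 0)] * 10
             + [(1, 0, 0, 1)] * 10 + [(0, 1, 1, 0)] * 10)
control_haps = ([(1, 0, 1, 0)] * 10 + [(0, 0, 0, 0)] * 20
                + [(1, 0, 0, 1)] * 15 + [(0, 1, 1, 0)] * 15)

results, n_tests = stepwise_search(case_haps, control_haps, n_snps=4)
```

The returned `n_tests` is the number of SNP sets evaluated across all steps. A Bonferroni correction in this simplified setting would multiply each nominal p-value by a count of this kind; how the authors counted tests is not stated in the abstract.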
A 4-SNP haplotype pattern was identified with high statistical significance in the case-control set (p = 4.13 × 10^-7 after Bonferroni correction) and remained significant in the family set after Bonferroni correction (p = 0.0398). Further analysis revealed that this pattern mainly reflects the effect of the well-known FLG gene; in addition, however, a FLG-independent haplotype was found in both the case-control set (OR = 1.71, 95% CI: 1.32-2.23, p = 5.6 × 10^-5) and the family set (OR = 1.68, 95% CI: 1.18-2.38, p = 2.19 × 10^-3).
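As a rough plausibility check on the reported odds ratios, the sketch below recovers an approximate z-score and two-sided p-value from each OR and its 95% confidence interval. It assumes a symmetric Wald interval on the log-odds scale, which the abstract does not state. The reported p-values come from the authors' haplotype regression, so only order-of-magnitude agreement should be expected. The function name is hypothetical.

```python
import math


def wald_p_from_or(or_hat, ci_low, ci_high, z_crit=1.959964):
    """Approximate z and two-sided p from an odds ratio and its 95% CI,
    assuming the CI is exp(log(OR) +/- z_crit * SE)."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_hat) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p


# Values taken from the abstract for the FLG-independent haplotype.
z_cc, p_cc = wald_p_from_or(1.71, 1.32, 2.23)    # case-control set
z_fam, p_fam = wald_p_from_or(1.68, 1.18, 2.38)  # family set
```

Under these assumptions the case-control estimate corresponds to a z-score of about 4 and the family estimate to a z-score of about 2.9, in the same range as the reported p-values.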
Our approach is a useful tool for finding allele combinations associated with diseases beyond single SNP analysis in chromosomal candidate regions.