Division of Cancer Epidemiology, Unit of Genetic Epidemiology, German Cancer Research Center DKFZ, Im Neuenheimer Feld 280, Heidelberg, Germany.
Genet Epidemiol. 2010 May;34(4):354-63. doi: 10.1002/gepi.20491.
Haplotype sharing analysis is a well-established option for investigating the etiology of complex diseases. The statistical power of haplotype association methods depends strongly on how well the information in unobserved haplotypes can be captured by multilocus genotypes. In this study we combine an entropy-based marker selection algorithm (EMS) with haplotype sharing-based Mantel statistics into a new algorithm. Genetic markers are iteratively selected by their multilocus linkage disequilibrium (LD), which is assessed by a normalized entropy difference. The initial marker set is gradually enlarged to increase the available information on the amount of sharing around a potential susceptibility marker. Markers are rejected from joint phasing if they do not increase the multilocus LD. In simulated candidate gene studies, the Mantel statistics combined with the new EMS perform as well as or better at detecting the disease single nucleotide polymorphism (or, in indirect association analysis, its flanking markers) than the Mantel statistics without selection of markers prior to haplotype estimation and the Mantel statistics using sliding windows of size five. It is therefore appealing to apply our selection approach to haplotype-based association analysis, since marker selection driven by the observed data avoids both the arbitrary choice of markers that comes with a fixed window size and the estimation of haplotype block structure.
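The selection step described above can be illustrated with a small sketch. The abstract does not give the formula for the normalized entropy difference, so the code assumes one common form, (H_E - H_S) / H_E, where H_S is the entropy of the observed multilocus haplotype distribution and H_E is the entropy expected under linkage equilibrium (the sum of the single-marker entropies). It also assumes phased haplotypes are already available, whereas the study works from multilocus genotypes with estimated haplotypes. The function names and the simple outward expansion from a starting marker are hypothetical simplifications, not the published EMS.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def normalized_entropy_difference(haplotypes, markers):
    """Assumed multilocus LD measure: (H_E - H_S) / H_E over the given marker columns."""
    # H_S: entropy of the observed haplotypes restricted to the chosen markers.
    sub = [tuple(h[m] for m in markers) for h in haplotypes]
    h_s = entropy(Counter(sub).values())
    # H_E: entropy under linkage equilibrium, i.e. the sum of single-marker entropies.
    h_e = sum(entropy(Counter(h[m] for h in haplotypes).values()) for m in markers)
    return (h_e - h_s) / h_e if h_e > 0 else 0.0


def select_markers(haplotypes, start, n_markers):
    """Greedy sketch: grow the marker set outward from `start`, keeping a
    candidate only if it increases the multilocus LD of the selected set."""
    selected = [start]
    current = 0.0  # a single marker has H_E == H_S, so its score is 0
    left, right = start - 1, start + 1
    while left >= 0 or right < n_markers:
        for cand in (left, right):
            if 0 <= cand < n_markers:
                trial = sorted(selected + [cand])
                score = normalized_entropy_difference(haplotypes, trial)
                if score > current:  # reject markers that do not increase LD
                    selected, current = trial, score
        left -= 1
        right += 1
    return selected, current
```

For example, with the toy haplotypes `[(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]`, markers 0 and 1 are in complete LD and marker 2 is independent of both, so starting from marker 1 the sketch keeps marker 0 and rejects marker 2.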