Flaquer Antònia, Strauch Konstantin
Institute of Medical Informatics, Biometry and Epidemiology, Chair of Genetic Epidemiology, Ludwig-Maximilians-Universität (LMU) Munich, Germany.
BMC Res Notes. 2012 Aug 6;5:411. doi: 10.1186/1756-0500-5-411.
In recent years, genome-wide association (GWA) studies have successfully identified common SNPs associated with complex diseases. However, most of the variants found this way account for only a small portion of the trait variance. This has led researchers to focus on rare-variant mapping with large-scale sequencing, which can be facilitated by using linkage information. The question arises why linkage analysis often fails to identify genes when analyzing complex diseases. Using simulations, we investigated the power of parametric and nonparametric linkage statistics (KC-LOD, NPL, LOD and MOD scores) to detect the effect of genes responsible for complex diseases using different pedigree structures.
As expected, a small number of pedigrees with fewer than three affected individuals has low power to map disease genes with modest effect. Interestingly, the power decreases when unaffected individuals are included in the analysis, irrespective of the true mode of inheritance. Furthermore, we found that the best-performing statistic depends not only on the type of pedigrees but also on the true mode of inheritance.
When applied in a sensible way, linkage analysis is an appropriate and robust technique to map genes for complex diseases. Unlike association analysis, linkage analysis is not hampered by allelic heterogeneity. So, why does linkage analysis often fail with complex diseases? Evidently, when using an insufficient number of small pedigrees, one might miss a true genetic linkage. Furthermore, we show that the choice of test statistic also has an important effect on the power to detect linkage. Therefore, a linkage analysis might fail if an inadequate test statistic is employed. We provide recommendations regarding the most favorable test statistics, in terms of power, for a given mode of inheritance and type of pedigrees under study, in order to reduce the probability of missing a true linkage.
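The link between sample size and power described above can be illustrated with a deliberately simplified Monte Carlo sketch. It is not the simulation design used in this study: it assumes affected sib pairs rather than extended pedigrees, a fully informative marker at the disease locus, and the classical mean IBD-sharing test rather than the KC-LOD, NPL, LOD or MOD statistics. The function name `simulate_power` and the alternative sharing probabilities `(0.2, 0.5, 0.3)` are hypothetical choices made only for illustration.

```python
import random
from statistics import NormalDist


def simulate_power(n_pairs, ibd_probs, alpha=0.05, n_reps=2000, seed=1):
    """Estimate the power of the mean IBD-sharing test by simulation.

    n_pairs   -- number of affected sib pairs per simulated study
    ibd_probs -- assumed probabilities that a pair shares 0, 1 or 2
                 alleles identical by descent (IBD) at the locus
    Returns the fraction of replicates in which the one-sided test
    rejects the null hypothesis of no linkage at level alpha.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Under no linkage the IBD count has variance 1/2, so the mean
    # sharing proportion (count / 2) has variance 1 / (8 * n_pairs).
    null_sd = (1.0 / (8.0 * n_pairs)) ** 0.5
    rejections = 0
    for _ in range(n_reps):
        ibd = rng.choices([0, 1, 2], weights=ibd_probs, k=n_pairs)
        pi_hat = sum(ibd) / (2.0 * n_pairs)  # mean proportion shared IBD
        z = (pi_hat - 0.5) / null_sd
        if z > z_crit:
            rejections += 1
    return rejections / n_reps


null_probs = (0.25, 0.5, 0.25)  # expected sharing without linkage
alt_probs = (0.2, 0.5, 0.3)     # hypothetical modest excess sharing

type_one_error = simulate_power(50, null_probs)
power_small = simulate_power(50, alt_probs)
power_large = simulate_power(200, alt_probs)
print(type_one_error, power_small, power_large)
```

Under these assumptions, the rejection rate with null sharing probabilities should stay near the nominal level, while the estimated power for the same modest effect rises as the number of pairs grows from 50 to 200.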