Queensland Institute of Medical Research, Brisbane, Australia.
Eur J Hum Genet. 2012 Jun;20(6):668-74. doi: 10.1038/ejhg.2011.257. Epub 2012 Jan 18.
Disorders that share genetic risk factors are often placed in closely related diagnostic categories and treated similarly. Until recently, evidence for shared genetic etiology came from classical research strategies, namely coaggregation in family and twin studies, for which accumulating sufficient numbers of families was often problematic. In the era of genome-wide genotyping, however, we can directly estimate the degree to which disorders share genetic risk factors. This strategy is practical even for very rare disorders, for which it is infeasible to ascertain informative families. Importantly, estimates of genetic correlation from genome-wide genotypes are derived from relatives so distant that contamination by shared environmental factors seems unlikely. However, any method that seeks to quantify the shared etiology of disorders assumes that they can be distinguished diagnostically from one another without error. Here we investigate the impact of misdiagnosis on estimates of genetic correlation, both from traditional family data and from genome-wide genotypes of case-control samples of unrelated individuals. Our analyses show similar results for a given level of misdiagnosis in both types of data. In both scenarios, genetic variances and heritabilities tend to be slightly underestimated, but genetic correlations are overestimated, sometimes substantially so. For example, two genetically distinct but equally heritable disorders, each with a prevalence of 1%, can generate false-positive estimates of genetic correlation of >0.2 in the presence of 10% reciprocal misdiagnosis. Strategies for minimizing the effects of misdiagnosis in cross-disorder genetic studies are discussed.
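The mechanism behind the 1% prevalence, 10% misdiagnosis example can be illustrated with a minimal sketch. It is not the estimation procedure used in the study and does not reproduce the genetic correlation of >0.2. It only shows how reciprocal misdiagnosis can create apparent cross-disorder aggregation in siblings when the true genetic correlation is zero. Everything here is an assumption made for illustration: a liability-threshold model, a heritability of 0.6, a sibling liability correlation of half the heritability (additive genetic effects only, no shared environment), misdiagnosis that is independent between siblings, and negligible true comorbidity. The function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal liability scale


def sibling_joint_risk(prevalence, liability_corr, steps=4000, span=8.0):
    """P(both siblings affected by the same disorder), liability-threshold model.

    Each liability is split into a component shared by the siblings and a
    unique component; the shared component is integrated out numerically
    with the midpoint rule over [-span, span].
    """
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    shared_sd = liability_corr ** 0.5
    unique_sd = (1.0 - liability_corr) ** 0.5
    width = 2.0 * span / steps
    total = 0.0
    for i in range(steps):
        s = -span + (i + 0.5) * width
        # Probability that one sibling exceeds the threshold given shared value s.
        p_affected = 1.0 - STD_NORMAL.cdf((threshold - shared_sd * s) / unique_sd)
        total += STD_NORMAL.pdf(s) * p_affected ** 2 * width
    return total


def cross_disorder_sibling_ratio(prevalence, heritability, misdiagnosis):
    """Recorded cross-disorder sibling relative risk for two disorders that are
    genetically independent but equally heritable and equally prevalent.

    A ratio of 1 means no apparent cross-disorder aggregation.
    Assumption: a fraction `misdiagnosis` of true cases of each disorder is
    recorded as the other disorder, independently for each sibling.
    """
    liability_corr = 0.5 * heritability  # assumed: additive effects only
    lambda_same = sibling_joint_risk(prevalence, liability_corr) / prevalence ** 2
    m = misdiagnosis
    # Correctly labelled and doubly swapped pairs carry no excess risk;
    # pairs with exactly one swapped label inherit the same-disorder risk.
    return (1 - m) ** 2 + m ** 2 + 2 * m * (1 - m) * lambda_same


if __name__ == "__main__":
    for m in (0.0, 0.05, 0.10):
        ratio = cross_disorder_sibling_ratio(0.01, 0.6, m)
        print(f"misdiagnosis={m:.2f}  cross-disorder sibling ratio={ratio:.2f}")
```

With no misdiagnosis the ratio is exactly 1, as expected for genetically distinct disorders. Any reciprocal misdiagnosis mixes same-disorder sibling pairs into the cross-disorder comparison, which is the kind of signal a genetic correlation estimate would pick up as shared etiology.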