Department of Biostatistics, University of Washington, Seattle, 98195-7720, USA.
Genet Epidemiol. 2009 Dec;33(8):668-78. doi: 10.1002/gepi.20418.
Genome-wide association studies yield inflated false-positive rates when unrecognized cryptic relatedness exists. A number of methods have been proposed for testing association between markers and disease with a correction for known pedigree-based relationships. However, in most case-control studies, relationships are generally unknown, yet the design is predicated on the assumption of at least ancestral relatedness among cases. Here, we focus on adjusting for cryptic relatedness when the genealogy of the sample is unknown, particularly in the context of samples from isolated populations where cryptic relatedness may be problematic. We estimate cryptic relatedness using maximum-likelihood methods and use a corrected χ² test with estimated kinship coefficients for testing in the context of unknown cryptic relatedness. Estimated kinship coefficients precisely characterize the relatedness between truly related people, but are biased for unrelated pairs. The proposed test substantially reduces spurious positive results, producing a uniform null distribution of P-values. In particular, when pedigree information is missing, estimated kinship coefficients can still be used to correct for non-independence among individuals. The corrected test was applied to real data sets from genetic isolates and produced a distribution of P-values that was close to uniform. Thus, the proposed test corrects the non-uniform distribution of P-values obtained with the uncorrected test, illustrating the advantage of the approach on real data.
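To make the idea of a kinship-corrected χ² test concrete, the following is a minimal sketch and not necessarily the authors' exact statistic. It assumes an allele-frequency contrast between cases and controls whose null variance is scaled by a kinship matrix, so that positive kinship among cases enlarges the variance and shrinks the statistic. The function name `corrected_chi2`, its arguments, and the toy data are hypothetical; the kinship matrix is taken as given here, whereas the abstract estimates it by maximum likelihood.

```python
import math
import numpy as np


def corrected_chi2(genotypes, is_case, kinship):
    """Kinship-corrected 1-df chi-square test at a single biallelic marker.

    Assumptions (illustrative, not taken from the abstract):
      genotypes : allele counts (0, 1, 2) per individual
      is_case   : boolean case indicator per individual
      kinship   : N x N kinship coefficients, with self-kinship
                  (1 + inbreeding) / 2 on the diagonal (0.5 if non-inbred)
    """
    g = np.asarray(genotypes, dtype=float)
    case = np.asarray(is_case, dtype=bool)
    K = np.asarray(kinship, dtype=float)

    y = g / 2.0                      # per-individual allele proportion
    p_hat = y.mean()                 # naive pooled allele-frequency estimate
    if p_hat <= 0.0 or p_hat >= 1.0:
        raise ValueError("marker is monomorphic in this sample")

    # Contrast: mean allele proportion in cases minus mean in controls.
    w = np.where(case, 1.0 / case.sum(), -1.0 / (~case).sum())
    diff = w @ y

    # Under the null, Cov(y) = 0.5 * p * (1 - p) * (2 * K), so the variance
    # of the contrast picks up every pairwise kinship term.
    var = 0.5 * p_hat * (1.0 - p_hat) * (w @ (2.0 * K) @ w)

    stat = float(diff ** 2 / var)
    p_value = math.erfc(math.sqrt(stat / 2.0))   # chi-square(1) upper tail
    return stat, p_value


# Toy data: four cases followed by four controls.
g = [2, 2, 1, 1, 0, 0, 1, 1]
case = [True] * 4 + [False] * 4

K_unrelated = 0.5 * np.eye(8)        # non-inbred, mutually unrelated
stat_unrel, p_unrel = corrected_chi2(g, case, K_unrelated)

K_related = K_unrelated.copy()       # first two cases related, kinship 0.25
K_related[0, 1] = K_related[1, 0] = 0.25
stat_rel, p_rel = corrected_chi2(g, case, K_related)
```

With the unrelated kinship matrix the statistic reduces to the usual allele-based contrast; adding kinship between two cases lowers the statistic and raises the P-value, which is the direction of correction the abstract describes.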