James I, McKinnon E, Gaudieri S, Morahan G
Centre for Clinical Immunology and Biomedical Statistics, Murdoch University and Royal Perth Hospital, Perth, Western Australia, Australia.
Diabetes Obes Metab. 2009 Feb;11(Suppl 1):101-7. doi: 10.1111/j.1463-1326.2008.01010.x.
The absence or 'missingness' of single nucleotide polymorphism (SNP) assay values because of genotype or related factors of interest may bias association and other studies. Missingness was determined for the Type 1 Diabetes Genetics Consortium (T1DGC) Major Histocompatibility Complex (MHC) data and was found to vary across the region, ranging up to 11.1% of the non-null proband SNPs, with a median of 0.3%. We consider factors related to missingness in the T1DGC data and briefly assess its possible influence on association studies.
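As a minimal illustration of how per-SNP missingness summaries such as a maximum and a median could be computed, the following sketch uses a hypothetical toy data set. The function name `missingness_per_snp`, the SNP identifiers and the use of `None` to mark a missing assay value are assumptions made for illustration. They are not taken from the T1DGC data or its file formats.

```python
from statistics import median


def missingness_per_snp(genotypes):
    """Return the fraction of missing calls for each SNP.

    `genotypes` maps a SNP identifier to a list of genotype calls,
    one per individual; a missing call is represented by None
    (an assumed encoding, chosen only for this sketch).
    """
    rates = {}
    for snp, calls in genotypes.items():
        if not calls:
            # Skip SNPs with no individuals at all.
            continue
        rates[snp] = sum(call is None for call in calls) / len(calls)
    return rates


# Hypothetical toy data: three SNPs typed in four probands.
toy = {
    "snp_a": ["AA", "AG", None, "GG"],
    "snp_b": ["CC", "CC", "CT", "TT"],
    "snp_c": [None, None, "AT", "TT"],
}

rates = missingness_per_snp(toy)
summary = {"max": max(rates.values()), "median": median(rates.values())}
```
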
We assessed associations of missingness in the SNP assay data with human leucocyte antigen (HLA) genotype of the individual and with SNP genotypes of the parents. Within-cohort analyses were combined over all cohorts using either (i) Mantel-Haenszel tests for two-by-two tables or (ii) combined test statistics for larger tables and regression models. Mixed effect regression models were used to assess association of the SNP genotypes with affected status of the offspring after adjustment for parental SNP genotypes, cohort membership and HLA genotypes. Log-linear models were used to assess association of missingness in the unaffected sib assays with SNP genotypes of the probands.
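The combination of within-cohort two-by-two tables by a Mantel-Haenszel test can be sketched as follows. This is a generic textbook form of the Cochran-Mantel-Haenszel statistic (one degree of freedom, optional continuity correction), not necessarily the exact variant the authors used. The table layout (rows: carries a given HLA genotype or not; columns: SNP value missing or present) and the counts are hypothetical.

```python
import math


def mantel_haenszel(tables, continuity=True):
    """Cochran-Mantel-Haenszel test over stratified 2x2 tables.

    Each table is ((a, b), (c, d)), one table per cohort (stratum).
    Returns (chi-square statistic, p-value on 1 df, pooled odds ratio).
    """
    sum_a = sum_expected = sum_var = 0.0
    or_num = or_den = 0.0
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        if n < 2:
            # A stratum with fewer than two observations carries no information.
            continue
        sum_a += a
        # Expectation and hypergeometric variance of cell `a` under the null.
        sum_expected += (a + b) * (a + c) / n
        sum_var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        # Terms of the Mantel-Haenszel pooled odds ratio.
        or_num += a * d / n
        or_den += b * c / n
    if sum_var == 0:
        raise ValueError("no informative strata")
    diff = abs(sum_a - sum_expected)
    if continuity:
        diff = max(diff - 0.5, 0.0)
    chi2 = diff ** 2 / sum_var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    pooled_or = or_num / or_den if or_den > 0 else math.inf
    return chi2, p_value, pooled_or


# Hypothetical counts for two cohorts.
cohort_tables = [
    ((10, 90), (2, 98)),
    ((8, 42), (3, 47)),
]
chi2, p_value, pooled_or = mantel_haenszel(cohort_tables)

# A balanced table shows the behaviour under no association.
null_chi2, null_p, null_or = mantel_haenszel([((5, 5), (5, 5))])
```

Stratifying by cohort in this way keeps differences in overall missingness between cohorts from being mistaken for an association with genotype.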
Missingness of SNP values near the HLA class I (A, B and C) and class II (DR, DQ and DP) loci is strongly associated with carriage of corresponding HLA genotypes within these groups. Similar associations pertain to missing values among the microsatellite data. In at least some of these cases, regions of missingness coincided with known deletion regions corresponding to the associated HLA haplotype. We conjecture that other regions of associated missingness may point to similar haplotypic deletions. Analysis of association patterns of SNP genotypes with affected status of offspring does not indicate strong informative missingness. However, association of missingness in proband data with parental SNP genotypes may affect transmission disequilibrium test (TDT)-type analyses. Comparisons of affected and unaffected siblings point to possible susceptibility regions near BAT4-LY6G5B-BAT5 and NOTCH4, in addition to the classical HLA-DR3/4 alleles.
Potentially informative missingness in SNP assay values in the MHC region may affect association and related analyses based on the T1DGC data. These results suggest that it would be prudent to assess the degree to which missingness may undermine the SNP disease markers assessed in such studies. Initial analyses comparing affected and unaffected offspring suggest that these comparisons, at least, may be little affected.