Suzanne M. Leal
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA.
Genet Epidemiol. 2005 Nov;29(3):204-14. doi: 10.1002/gepi.20086.
Genotyping error can greatly reduce the power of a genetic study. For family data, genotyping error can be assessed by examining marker data for non-Mendelian inconsistencies, closely linked markers for double recombination events, and duplicate genotypes for consistency. For case-control data, duplicate samples are genotyped, and controls are tested for deviations from Hardy-Weinberg equilibrium (HWE). Duplicate samples can provide accurate estimates of genotyping error rates, unless systematic genotyping errors have occurred. Although genotyping errors can cause deviations from HWE, these deviations are usually small, and the power to detect them is low unless the genotyping error rate is high and/or the sample size is large. An additional problem is that even when deviations from HWE are detected for marker loci, it is not possible to unequivocally implicate genotyping error as the cause without additional experimentation. The power and sample sizes necessary to detect deviations from HWE for single-nucleotide polymorphism (SNP) data are examined for a variety of genotyping error and pseudo-SNP models. For the majority of genotyping error models examined, the power to detect deviations from HWE is poor. For example, for 1,000 controls, if an allele with a frequency of 0.1 fails to amplify for 28% of the heterozygous genotypes, producing a sample error rate of 0.05, the power to detect a deviation from HWE at an alpha level of 0.05 is 0.51. In contrast, for pseudo-SNPs (paralogous and ectopic sequence variants), the majority of models examined yield a power of >0.8 to detect deviations from HWE for sample sizes as small as 50 individuals.
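The allele-dropout example in the abstract can be approximated with a short calculation. The sketch below is not the paper's method: it assumes that a heterozygote in which the rarer allele fails to amplify is read as a homozygote for the other allele, and it uses the standard large-sample approximation for the 1-df chi-square HWE test, with noncentrality N·f², where f is the relative heterozygote deficit. The function names `dropout_genotype_freqs` and `hwe_power` are hypothetical, chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def dropout_genotype_freqs(p, dropout):
    """Observed genotype frequencies under an assumed allele-dropout model.

    p       : true frequency of allele A (the allele that fails to amplify)
    dropout : fraction of AB heterozygotes in which A fails to amplify
              (assumption: such genotypes are then called BB)
    Returns (AA, AB, BB, error_rate); true genotypes are taken to be in HWE.
    """
    q = 1.0 - p
    het_true = 2.0 * p * q
    error_rate = dropout * het_true          # miscalled share of all samples
    return p * p, het_true - error_rate, q * q + error_rate, error_rate


def hwe_power(aa, ab, bb, n, alpha=0.05):
    """Approximate power of the 1-df chi-square HWE test (asymptotic)."""
    p_obs = aa + ab / 2.0                    # allele frequency as observed
    q_obs = 1.0 - p_obs
    f = 1.0 - ab / (2.0 * p_obs * q_obs)     # relative heterozygote deficit
    delta = sqrt(n) * abs(f)                 # sqrt of the noncentrality N*f^2
    norm = NormalDist()
    z = norm.inv_cdf(1.0 - alpha / 2.0)
    # A noncentral chi-square with 1 df is the square of a shifted normal.
    return norm.cdf(delta - z) + norm.cdf(-delta - z)


if __name__ == "__main__":
    aa, ab, bb, err = dropout_genotype_freqs(p=0.1, dropout=0.28)
    print(f"sample error rate: {err:.4f}")
    print(f"approximate power: {hwe_power(aa, ab, bb, n=1000):.2f}")
```

With p = 0.1 and 28% dropout, the miscalled fraction is 0.28 × 0.18 ≈ 0.05, matching the sample error rate quoted in the abstract. The approximate power for 1,000 controls comes out at roughly one half; it need not equal the reported 0.51 exactly, since the test and the power calculation used in the paper may differ from this approximation.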