Devlin B, Risch N
Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT 06510.
Am J Hum Genet. 1992 Sep;51(3):534-48.
Allele-rich VNTR loci provide valuable information for forensic inference. Interpretation of this information is complicated by measurement error, which makes discrete alleles difficult to distinguish. Two methods have been used to circumvent this difficulty: binning methods and direct evaluation of allele frequencies, the latter achieved by modeling the data as a mixture distribution. We use this modeling approach to estimate the allele frequency distributions for two loci, D17S79 and D2S44, in black, Caucasian, and Hispanic samples from the Lifecodes and FBI databases. The databases are differentiated by the restriction enzyme used: PstI (Lifecodes) and HaeIII (FBI). Our results show that alleles common in one ethnic group are almost always common in all ethnic groups, and likewise for rare alleles; this pattern holds for both loci. Gene diversity, or heterozygosity, measured as one minus the sum of the squared allele frequencies, is greater for D2S44 than for D17S79 in both databases. The average gene diversity across ethnic groups when PstI (HaeIII) is used is 0.918 (0.918) for D17S79 and 0.985 (0.983) for D2S44. The variance in gene diversity among ethnic groups is greater for D17S79 than for D2S44. The number of alleles, like the gene diversity, is greater for D2S44 than for D17S79. The mean numbers of alleles across ethnic groups, estimated from the PstI (HaeIII) data, are 40.25 (41.5) for D17S79 and 104 (103) for D2S44. The number of alleles is correlated with sample size. We use the estimated allele frequency distributions for each ethnic group to explore the effects of unwittingly mixing populations and thereby violating independence assumptions. We show that, even in extreme cases of mixture, the estimated genotype probabilities are good estimates of the true probabilities, contradicting recent claims. Because the binning methods currently used for forensic inference show even less differentiation among ethnic groups, we conclude that mixture has little or no impact on the use of VNTR loci for forensics.
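The gene diversity measure defined in the abstract (one minus the sum of the squared allele frequencies) can be illustrated with a minimal Python sketch. The function names `gene_diversity` and `pooled_frequencies` and the three-allele toy frequencies are hypothetical and chosen only for illustration; they are not the authors' code or data, and the equal-weight pooling is an assumed, simplified stand-in for the population mixing the abstract describes.

```python
import math
from typing import List, Sequence


def gene_diversity(freqs: Sequence[float]) -> float:
    """Return 1 - sum(p_i^2) for allele frequencies p_i that sum to 1."""
    if any(p < 0 for p in freqs):
        raise ValueError("allele frequencies must be non-negative")
    if not math.isclose(sum(freqs), 1.0, abs_tol=1e-9):
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def pooled_frequencies(groups: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Weighted average of per-group allele frequencies.

    `weights` are the assumed mixing proportions and must sum to 1.
    """
    if len(groups) != len(weights):
        raise ValueError("need exactly one weight per group")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    n_alleles = len(groups[0])
    if any(len(g) != n_alleles for g in groups):
        raise ValueError("all groups must list the same alleles")
    return [sum(w * g[i] for g, w in zip(groups, weights))
            for i in range(n_alleles)]


if __name__ == "__main__":
    # Hypothetical toy frequencies for two groups at a three-allele locus.
    group_a = [0.6, 0.3, 0.1]
    group_b = [0.2, 0.3, 0.5]
    pooled = pooled_frequencies([group_a, group_b], [0.5, 0.5])

    for name, freqs in [("A", group_a), ("B", group_b), ("pooled", pooled)]:
        print(name, round(gene_diversity(freqs), 4))
```

For k equally frequent alleles the measure reaches its maximum of 1 - 1/k, so a locus with many alleles can approach but never reach 1, which is consistent with the higher diversity reported for the more allele-rich D2S44.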