Chakraborty R, Jin L
Center for Demographic and Population Genetics, University of Texas Graduate School of Biomedical Sciences, Houston 77225.
Hum Biol. 1993 Dec;65(6):875-95.
Relatedness between individuals is an important element of genetic-epidemiological and evolutionary investigations in anthropological research. In general, data on relationships between individuals are gathered from personal interviews or from examination of vital records. When blood samples are collected, such information can be validated from the genotypic similarities of individuals. Although genotype data may allow certain types of relationships to be excluded, inclusionary statements are necessarily only probabilistic. The limitations of such probabilistic statements depend on the number of segregating alleles and the extent of polymorphism at the loci employed. With the advent of DNA technology, several hypervariable single-locus probes (SLPs) and multilocus probes (MLPs) are now available for many organisms. These can be used to overcome the limitations on unequivocal assignment of relationships from genotype data. In this article we describe analytical principles for such investigations. In particular, we propose summary measures of DNA fingerprinting data (e.g., number of different alleles and number of shared alleles) that can be used to describe kinship relationships between individuals. We derive the expected distributions of the number of alleles in individuals and of the number of shared alleles between individuals of known relationship in a population. These distributions can be used in hypothesis testing to determine relatedness between individuals. We also derive the number of SLPs, each detecting a hypervariable polymorphism, needed to determine a specified relationship within given ranges of prediction error. Illustrations of the theory with data on several short tandem repeat loci and variable number of tandem repeat (VNTR) loci indicate that with 6 to 12 SLPs, parent-offspring pairs can be reliably distinguished from random pairs of individuals. This theory can also be used to detect inbreeding levels in a natural population.
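The idea of comparing shared-allele distributions across relationships can be illustrated with a minimal Python sketch. It is not the authors' derivation. It assumes a single codominant locus in Hardy-Weinberg equilibrium, no mutation or typing error, independent loci, and "shared alleles" counted as the number of distinct alleles two individuals have in common. The function names `shared_allele_distribution` and `loci_needed`, and the simple exclusion rule used in the latter, are hypothetical choices made for illustration only.

```python
from itertools import product
from math import ceil, log


def shared_allele_distribution(freqs, parent_offspring=False):
    """Distribution of the number of distinct alleles shared at one locus.

    freqs: allele frequencies summing to 1 (Hardy-Weinberg assumed).
    Returns a dict {0: p0, 1: p1, 2: p2}.
    """
    alleles = range(len(freqs))
    dist = {0: 0.0, 1: 0.0, 2: 0.0}
    # (a, b) is the first individual's ordered genotype; c is a random allele.
    for a, b, c in product(alleles, repeat=3):
        p_abc = freqs[a] * freqs[b] * freqs[c]
        if parent_offspring:
            # The child receives a or b (probability 1/2 each) plus allele c.
            for transmitted in (a, b):
                shared = len({a, b} & {transmitted, c})
                dist[shared] += 0.5 * p_abc
        else:
            # Unrelated pair: the second genotype (c, d) is drawn independently.
            for d in alleles:
                shared = len({a, b} & {c, d})
                dist[shared] += p_abc * freqs[d]
    return dist


def loci_needed(p_zero_random, alpha):
    """Smallest number of independent loci L with (1 - p0)^L <= alpha.

    Assumed rule (not from the source): a pair is called parent-offspring
    only if it shares at least one allele at every locus, so alpha bounds
    the chance that a random pair is wrongly included.
    """
    if not 0.0 < p_zero_random < 1.0:
        raise ValueError("p_zero_random must lie strictly between 0 and 1")
    return ceil(log(alpha) / log(1.0 - p_zero_random))
```

Under these assumptions a parent and child always share at least one allele, so the probability that a random pair shares none is what drives exclusion. With two equally frequent alleles that probability is only 0.125, which is why highly polymorphic loci are needed to keep the number of probes small.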