Lathrop G M, Hooper A B, Huntsman J W, Ward R H
Am J Hum Genet. 1983 Mar;35(2):241-62.
Pedigrees used in the analysis of genetic or medical data are usually ascertained from sources subject to a variety of errors including misidentification of individuals, faults in historical documents or record linkage, nonpaternity, and unidentified adoption. Genetic markers can be used to verify putative family and pedigree data through the search for inconsistencies, or genetic exclusions, between putative parents and offspring. The probability of observing an exclusion given the occurrence of an error depends upon the gene frequencies at the loci under study and the forms of error. In addition, inconsistencies can arise from laboratory errors in marker determination. Together, these problems make the proper statistical analysis of such data desirable. Here we give a model that specifies the combined effects of various kinds of pedigree error along with genetic marker error. This model allows the maximum-likelihood estimation of the rates of various forms of pedigree error and laboratory error from genetic marker data collected on putative families. The method is illustrated by applying it to data obtained from a South Pacific island population, Tokelau. From the observed distribution of genetic marker inconsistencies between the parents and offspring of putative families, derived from the extensive genealogy of this population, we are able to estimate that the error of a paternal link is 4%, the error of a maternal link is zero, and the overall system typing error is 1%.
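To illustrate how exclusion probabilities depend on gene frequencies, and how an observed exclusion count maps back to an error rate, here is a minimal sketch. It is not the authors' model, which estimates several pedigree error rates jointly with the marker typing error. It assumes biallelic codominant loci in Hardy-Weinberg equilibrium, unlinked loci, putative parent-child pairs typed without the other parent, an erroneous link that substitutes an unrelated individual, and no typing error. All function names are hypothetical.

```python
from math import prod


def duo_exclusion_prob(p):
    """Chance that an unrelated adult and child are opposite homozygotes
    (AA vs BB) at a biallelic codominant locus with allele frequency p.
    This is the only genotype pattern that excludes a parent-child pair
    when the other parent is untyped. Assumes Hardy-Weinberg proportions."""
    q = 1.0 - p
    return 2.0 * p**2 * q**2


def combined_exclusion_prob(allele_freqs):
    """Chance of at least one exclusion across independent loci."""
    return 1.0 - prod(1.0 - duo_exclusion_prob(p) for p in allele_freqs)


def estimate_link_error(n_pairs, n_exclusions, exclusion_power):
    """Maximum-likelihood estimate of the link error rate under the
    simplified binomial model P(exclusion) = error_rate * exclusion_power.
    Typing error is ignored here (an assumption of this sketch)."""
    if n_pairs <= 0 or not 0.0 < exclusion_power <= 1.0:
        raise ValueError("need n_pairs > 0 and 0 < exclusion_power <= 1")
    # The binomial likelihood peaks at n_exclusions / n_pairs; dividing by
    # the exclusion power rescales it, capped at 1 to stay a valid rate.
    return min(1.0, (n_exclusions / n_pairs) / exclusion_power)


# Illustrative numbers only, not data from the study.
power = combined_exclusion_prob([0.5, 0.5])
rate = estimate_link_error(n_pairs=1000, n_exclusions=10, exclusion_power=0.25)
```

Because a single exclusion count cannot separate pedigree error from typing error, a fuller treatment needs additional structure, such as the distinct inconsistency patterns across parents and offspring that the abstract describes.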