Kling Daniel, Egeland Thore, Piñero Mariana Herrera, Vigeland Magnus Dehli
Department of Forensic Services, Oslo University Hospital, Oslo, Norway.
IKBM, Norwegian University of Life Sciences, Ås, Norway.
Forensic Sci Int Genet. 2017 Nov;31:57-66. doi: 10.1016/j.fsigen.2017.08.006. Epub 2017 Aug 12.
Methods and implementations of DNA-based identification are well established in several forensic contexts. However, assessing the statistical power of these methods has been largely overlooked, except in the simplest cases. In this paper we outline general methods for such power evaluation, and apply them to a large set of family reunification cases, where the objective is to decide whether a person of interest (POI) is identical to the missing person (MP) in a family, based on the DNA profile of the POI and available family members. As such, this application closely resembles database searching and disaster victim identification (DVI). If parents or children of the MP are available, they will typically provide sufficient statistical evidence to settle the case. However, if one must resort to more distant relatives, it is not a priori obvious that a reliable conclusion is likely to be reached. In these cases power evaluation can be highly valuable, for instance in the recruitment of additional family members. To assess the power in an identification case, we advocate the combined use of two statistics: the Probability of Exclusion, and the Probability of Exceedance. The former is the probability that the genotypes of a random, unrelated person are incompatible with the available family data. If this is close to 1, it is likely that a conclusion will be achieved regarding general relatedness, but not necessarily the specific relationship. To evaluate the ability to recognize a true match, we use simulations to estimate exceedance probabilities, i.e. the probability that the likelihood ratio will exceed a given threshold, assuming that the POI is indeed the MP. All simulations are done conditionally on available family data. Such conditional simulations have a long history in medical linkage analysis, but to our knowledge this is the first systematic forensic genetics application. 
Also, for forensic markers, mutations cannot be ignored, and therefore current models and implementations must be extended. All the tools are freely available in Familias (http://www.familias.no), powered by the R library paramlink. The above approach is applied to a large and important data set: 'The missing grandchildren of Argentina'. We evaluate the power of 196 families from the DNA reference databank (Banco Nacional de Datos Genéticos, http://www.bndg.gob.ar). As a result, we show that 58 of the families have poor statistical power and require additional genetic data to enable a positive identification.
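The two statistics described above can be illustrated with a minimal Python sketch. It is not the Familias or paramlink implementation. It assumes the simplest possible family, in which a single parent of the MP is typed, and it ignores mutations, a simplification the abstract explicitly warns against for real casework. The allele frequencies, genotypes, function names, and threshold are all hypothetical and chosen only for illustration.

```python
import random
from math import prod

# Hypothetical allele frequencies for three independent markers (illustrative only).
MARKERS = [
    {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4},
    {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25},
    {"A": 0.05, "B": 0.15, "C": 0.8},
]
# Hypothetical genotypes of the only typed reference: one parent of the MP.
PARENT = [("A", "B"), ("C", "C"), ("A", "C")]


def exclusion_probability(freqs, parent):
    """P(a random unrelated person shares no allele with the parent), one marker."""
    shared = sum(freqs[a] for a in set(parent))
    return (1.0 - shared) ** 2


def total_exclusion_probability(markers, parents):
    """Independent markers: excluded overall if excluded at one or more markers."""
    return 1.0 - prod(1.0 - exclusion_probability(f, g) for f, g in zip(markers, parents))


def genotype_prob(freqs, genotype):
    """Hardy-Weinberg probability of an unordered genotype."""
    x, y = genotype
    return freqs[x] ** 2 if x == y else 2 * freqs[x] * freqs[y]


def child_prob_given_parent(freqs, child, parent):
    """P(child genotype | parent genotype), assuming no mutation."""
    x, y = child
    total = 0.0
    for t in parent:  # each parental allele is transmitted with probability 1/2
        if x == y:
            total += 0.5 * (freqs[x] if t == x else 0.0)
        else:
            total += 0.5 * ((freqs[y] if t == x else 0.0) + (freqs[x] if t == y else 0.0))
    return total


def likelihood_ratio(markers, parents, children):
    """LR for 'POI is the child of the typed parent' versus 'POI is unrelated'."""
    return prod(
        child_prob_given_parent(f, c, p) / genotype_prob(f, c)
        for f, p, c in zip(markers, parents, children)
    )


def simulate_child(rng, freqs, parent):
    """Draw a child genotype conditional on the typed parent's genotype."""
    transmitted = rng.choice(parent)
    alleles = list(freqs)
    other = rng.choices(alleles, weights=[freqs[a] for a in alleles])[0]
    return (transmitted, other)


def exceedance_probability(markers, parents, threshold, n_sims=10000, seed=1):
    """Estimate P(LR > threshold | POI is the MP) by conditional simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        children = [simulate_child(rng, f, p) for f, p in zip(markers, parents)]
        if likelihood_ratio(markers, parents, children) > threshold:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("Probability of Exclusion:", total_exclusion_probability(MARKERS, PARENT))
    print("P(LR > 10 | true match):", exceedance_probability(MARKERS, PARENT, 10.0))
```

With only three toy markers, both numbers are far from what a realistic marker panel would give. The sketch is meant to show how the two quantities are defined and how the simulation conditions on the typed relative. It makes no claim about realistic power. Pedigrees with more distant relatives, and mutation models, need the general likelihood machinery that the abstract attributes to Familias and paramlink.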