Division of Biostatistics, Indiana University School of Medicine, Indianapolis, USA.
Stat Methods Med Res. 2013 Feb;22(1):31-8. doi: 10.1177/0962280211403600. Epub 2011 Jun 10.
We review ideas, approaches, and progress in the field of record linkage. We point out that the latent class models used in probabilistic matching have been well developed and applied in a different context: diagnostic testing when the true disease status is unknown. The methodology developed in the diagnostic testing setting can potentially be translated and applied to record linkage. Although many methods for record linkage exist, a comprehensive evaluation of these methods across a wide range of real-world data, with different data characteristics and known true match status, is absent because of a lack of data sharing. However, the recent availability of synthetic data generators that produce data with realistic characteristics makes such evaluations feasible.
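As an illustration of the kind of latent class model the abstract refers to, the following is a minimal sketch and not the authors' method. It assumes a two-class model (match / non-match) with conditionally independent binary field agreements, fitted by EM, which is the form commonly used in Fellegi-Sunter style probabilistic matching. The function names, starting values, and simulation parameters are all hypothetical choices made for this example; the simulated pairs stand in for synthetic data with known match status.

```python
import math
import random
from collections import Counter


def simulate_pairs(n_pairs, p_match, m_true, u_true, seed=0):
    """Simulate binary agreement patterns with known match status (toy data)."""
    rng = random.Random(seed)
    patterns, truth = [], []
    for _ in range(n_pairs):
        is_match = rng.random() < p_match
        probs = m_true if is_match else u_true
        patterns.append(tuple(int(rng.random() < q) for q in probs))
        truth.append(is_match)
    return patterns, truth


def fit_two_class_em(patterns, n_iter=300):
    """Fit a two-class latent class model by EM.

    Returns (p, m, u): p is the estimated match proportion,
    m[k] = P(field k agrees | match), u[k] = P(field k agrees | non-match).
    """
    counts = Counter(patterns)          # work on unique patterns only
    n = sum(counts.values())
    n_fields = len(patterns[0])
    eps = 1e-6

    def clamp(x):
        return min(max(x, eps), 1 - eps)

    # Starting values are assumptions: matches mostly agree, non-matches mostly do not.
    p, m, u = 0.1, [0.9] * n_fields, [0.1] * n_fields
    for _ in range(n_iter):
        # E-step: posterior probability that each pattern comes from a match.
        post = {}
        for g in counts:
            lm, lu = p, 1 - p
            for k, a in enumerate(g):
                lm *= m[k] if a else 1 - m[k]
                lu *= u[k] if a else 1 - u[k]
            post[g] = lm / (lm + lu)
        # M-step: update parameters from posterior-weighted counts.
        total = sum(c * post[g] for g, c in counts.items())
        p = clamp(total / n)
        for k in range(n_fields):
            agree_m = sum(c * post[g] for g, c in counts.items() if g[k])
            agree_u = sum(c * (1 - post[g]) for g, c in counts.items() if g[k])
            m[k] = clamp(agree_m / max(total, eps))
            u[k] = clamp(agree_u / max(n - total, eps))
    return p, m, u


def match_weight(pattern, m, u):
    """Log2 likelihood ratio (match vs. non-match) for one agreement pattern."""
    return sum(
        math.log2(m[k] / u[k]) if a else math.log2((1 - m[k]) / (1 - u[k]))
        for k, a in enumerate(pattern)
    )


# Toy usage: four comparison fields, 10% true matches (all values hypothetical).
M_TRUE = [0.95, 0.90, 0.85, 0.90]
U_TRUE = [0.05, 0.10, 0.20, 0.02]
pairs, truth = simulate_pairs(20000, 0.10, M_TRUE, U_TRUE, seed=1)
p_hat, m_hat, u_hat = fit_two_class_em(pairs)
print("estimated match proportion:", round(p_hat, 3))
```

Because the simulated data carry the true match status, the estimates can be compared directly with the generating values, which is the kind of evaluation the abstract argues synthetic data make feasible. The same model structure appears in diagnostic testing without a gold standard, with tests in place of field comparisons and disease status in place of match status.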