Cook L J, Olson L M, Dean J M
Intermountain Injury Control Research Center, Department of Pediatrics, University of Utah School of Medicine, Salt Lake City 84108, USA.
Methods Inf Med. 2001 Jul;40(3):196-203.
This study investigates relationships between file sizes, amounts of information contained in commonly used record linkage variables, and the amount of information needed for a successful probabilistic linkage project. We present an equation predicting the amount of information needed for a successful linkage project. Match weights for variables commonly used in record linkage are measured using artificially created databases. Linkage algorithms were successful when the sum of minimum weights for variables used in a linkage exceeded the predicted cutoff. Linkage results were acceptable when this sum was near the predicted cutoff. This technique enables researchers to determine if enough information exists to perform a successful probabilistic linkage.
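The abstract does not reproduce the predictive equation, so the following is only a minimal sketch of the general idea, under stated assumptions rather than the authors' exact formula. It assumes Fellegi-Sunter style agreement weights of log2(m/u) and derives a cutoff from a standard Bayesian odds argument: the prior odds that a random record pair is a true match, combined with a desired posterior probability. The function names, parameters, and all m/u probabilities below are hypothetical and chosen for illustration.

```python
import math


def agreement_weight(m: float, u: float) -> float:
    """Agreement weight in bits for one variable.

    m: assumed probability the variable agrees given a true match.
    u: assumed probability the variable agrees given a non-match.
    """
    return math.log2(m / u)


def required_weight(n_a: int, n_b: int, expected_matches: int,
                    posterior: float) -> float:
    """Total weight (bits) needed to reach a target posterior probability.

    Assumption: posterior odds = prior odds * likelihood ratio, where the
    prior odds are expected_matches / (all pairs - expected_matches).
    This is an illustrative derivation, not the equation from the paper.
    """
    pairs = n_a * n_b
    prior_odds = expected_matches / (pairs - expected_matches)
    target_odds = posterior / (1.0 - posterior)
    return math.log2(target_odds) - math.log2(prior_odds)


def has_enough_information(weights: list[float], cutoff: float) -> bool:
    """True when the summed variable weights reach the cutoff."""
    return sum(weights) >= cutoff


if __name__ == "__main__":
    # Hypothetical m/u values for three linkage variables.
    weights = [
        agreement_weight(0.95, 1 / 365),   # day and month of birth
        agreement_weight(0.98, 0.5),       # sex
        agreement_weight(0.90, 1 / 1000),  # postal code
    ]
    cutoff = required_weight(n_a=10_000, n_b=10_000,
                             expected_matches=8_000, posterior=0.9)
    print(f"sum of weights: {sum(weights):.2f} bits")
    print(f"required cutoff: {cutoff:.2f} bits")
    print("enough information:", has_enough_information(weights, cutoff))
```

Under these assumptions, the required weight grows with the product of the two file sizes, which is consistent with the abstract's point that larger files demand more discriminating variables before a probabilistic linkage can succeed.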