Randall Sean M, Boyd James H, Ferrante Anna M, Bauer Jacqueline K, Semmens James B
Centre for Data Linkage, Curtin University, Kent Street, Bentley, WA 6102, Australia.
Comput Methods Programs Biomed. 2014 Jul;115(2):55-63. doi: 10.1016/j.cmpb.2014.03.008. Epub 2014 Apr 3.
Ensuring high linkage quality is important in many record linkage applications. Current methods for ensuring quality are manual and resource-intensive. This paper seeks to determine the effectiveness of graph theory techniques in identifying record linkage errors. A range of graph theory techniques was applied to two linked datasets with known truth sets. The ability of these techniques to identify groups containing errors was compared with that of a widely used threshold setting technique. The methodology shows promise; however, further investigation into graph theory techniques is required. More efficient and effective methods of improving linkage quality will yield higher quality datasets that can be delivered to researchers in shorter timeframes.
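As a rough illustration of the general idea, linked records can be treated as nodes and pairwise links as edges, so that each connected component is one group of records assumed to belong to the same entity. The abstract does not say which graph measures were used, so the sketch below assumes one simple measure, edge density, and flags sparsely connected groups for review. The function names, the sample links and the 0.8 cut-off are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def linkage_groups(links):
    """Return the connected components (sets of record ids) of the link graph."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Iterative depth-first search collects one component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(component)
    return groups


def density(group, links):
    """Fraction of possible record pairs in the group that are directly linked."""
    n = len(group)
    if n < 2:
        return 1.0
    edges = {
        frozenset((a, b))
        for a, b in links
        if a != b and a in group and b in group
    }
    return len(edges) / (n * (n - 1) / 2)


def flag_suspect_groups(links, min_density=0.8):
    """Assumed rule: groups below min_density are candidates for manual review."""
    return [g for g in linkage_groups(links) if density(g, links) < min_density]


# Hypothetical pairwise links produced by a matching step.
links = [
    ("r1", "r2"), ("r2", "r3"), ("r1", "r3"),  # every record linked to every other
    ("r4", "r5"), ("r5", "r6"), ("r6", "r7"),  # a chain: r4 and r7 are only indirectly linked
]
suspects = flag_suspect_groups(links)
```

Here the fully connected group of three records has density 1.0 and passes, while the four-record chain has density 0.5 and is flagged. A chain of this kind is the sort of structure in which a single false link can merge two different entities.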