Victor T W, Mera R M
Healthcare Informatics, GlaxoSmithKline, Collegeville, PA 19426-2990, USA.
Stud Health Technol Inform. 2001;84(Pt 2):1409-13.
Limitations of current record linkage techniques include difficulty in handling large and heterogeneous data sets, low sensitivity in deterministic matching, the need to supply a priori weights for probabilistic matching, low computational efficiency, and complex software. This paper provides a detailed description of a method developed to link records of individuals across time and geography. The record linkage procedure consists of three major components: data standardization, weight estimation, and matching. The proposed method was designed to combine exact and probabilistic matching techniques. The procedure was validated using convergent, divergent, and criterion validity measures. The authors feel that the procedure outlined in this paper is a first step in addressing the current trend toward larger and more complex databases.
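The three components named above (data standardization, weight estimation, and matching) can be illustrated with a minimal sketch. The abstract does not give the authors' algorithm, so this is not their method: it is a generic Fellegi-Sunter-style example in which the field names, the sample records, the assumed m-probability of 0.95, and the score threshold of 0 are all hypothetical. The u-probabilities are estimated from the data rather than supplied a priori, and an exact-match pass takes precedence over the probabilistic score.

```python
import math
import re

# Hypothetical field names; the abstract does not list the fields used.
FIELDS = ("surname", "given", "dob", "zip")


def standardize(record):
    """Uppercase each field and drop everything except letters and digits."""
    return {f: re.sub(r"[^A-Z0-9]", "", str(record.get(f, "")).upper())
            for f in FIELDS}


def estimate_u(file_a, file_b):
    """Estimate u per field: the share of all record pairs that agree.

    Assumes true matches are a small share of all pairs, so chance
    agreement dominates. Values are clamped to keep the logs finite.
    """
    total = len(file_a) * len(file_b)
    u = {}
    for f in FIELDS:
        agree = sum(1 for a in file_a for b in file_b if a[f] and a[f] == b[f])
        u[f] = min(max(agree / total, 1e-6), 1 - 1e-6)
    return u


def field_weights(u, m=0.95):
    """Return (agreement, disagreement) log2 weights; m is an assumed constant."""
    return {f: (math.log2(m / u[f]), math.log2((1 - m) / (1 - u[f])))
            for f in FIELDS}


def link(file_a, file_b, threshold=0.0, m=0.95):
    """Link each record in file_a to at most one record in file_b."""
    a_std = [standardize(r) for r in file_a]
    b_std = [standardize(r) for r in file_b]
    weights = field_weights(estimate_u(a_std, b_std), m)
    links = []
    for i, a in enumerate(a_std):
        best = None
        for j, b in enumerate(b_std):
            if a == b:  # exact pass: all standardized fields agree
                best = (j, "exact", float("inf"))
                break
            # Probabilistic pass: sum agreement or disagreement weights.
            score = sum(weights[f][0] if a[f] and a[f] == b[f] else weights[f][1]
                        for f in FIELDS)
            if score > threshold and (best is None or score > best[2]):
                best = (j, "probabilistic", score)
        if best is not None:
            links.append((i, best[0], best[1]))
    return links


# Made-up sample records for illustration only.
file_a = [
    {"surname": "O'Brien", "given": "Mary", "dob": "1970-01-02", "zip": "19426"},
    {"surname": "Smith", "given": "John", "dob": "1980-05-06", "zip": "19426"},
    {"surname": "Lee", "given": "Ann", "dob": "1990-09-09", "zip": "10001"},
]
file_b = [
    {"surname": "obrien", "given": "MARY", "dob": "19700102", "zip": "19426"},
    {"surname": "Smith", "given": "Jon", "dob": "1980-05-06", "zip": "19426"},
    {"surname": "Garcia", "given": "Luis", "dob": "1975-03-03", "zip": "60601"},
]

links = link(file_a, file_b)
print(links)
```

Comparing every pair is quadratic in file size, which is the kind of inefficiency the abstract lists as a limitation; a practical version would likely restrict comparisons to blocks of candidate pairs, though the abstract does not say how the authors handle this.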