Durham Elizabeth, Xue Yuan, Kantarcioglu Murat, Malin Bradley
Department of Biomedical Informatics, Vanderbilt University, Nashville, TN
AMIA Annu Symp Proc. 2010 Nov 13;2010:182-6.
Federal regulations require that patient data shared for reuse be de-identified. However, disparate providers often share data on overlapping populations, so a patient's record may be duplicated or fragmented in the de-identified repository. To perform unbiased statistical analysis in a de-identified setting, it is crucial to integrate records that correspond to the same patient. Private record linkage techniques have been developed, but most are based on encryption, which precludes measuring the similarity of identifiers and thereby decreases the accuracy of record linkage. The goal of this research is to integrate a private string comparison method, which uses Bloom filters to provide approximate matching, with a medical record linkage algorithm. We evaluate the approach with the identifiers and demographics of 100,000 patients from the Vanderbilt University Medical Center. We demonstrate that the private approximation method achieves sensitivity that is, on average, 3% higher than previous methods.
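The Bloom filter string comparison mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the encoding details, so everything here is an assumption chosen for illustration: the use of padded character bigrams, the double-hashing scheme built from two keyed HMAC-SHA256 digests, the filter length `m`, the number of hash functions `k`, the Dice coefficient as the similarity measure, and the names `bigrams`, `bloom_encode`, and `dice`. It is not the authors' implementation.

```python
import hashlib
import hmac


def bigrams(value):
    """Split a string into its set of character bigrams (padded with spaces)."""
    padded = f" {value.lower()} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, key_a, key_b, m=1000, k=30):
    """Encode a string as the set of bit positions set in a Bloom filter.

    Assumed scheme: each bigram is hashed with two keyed HMACs, and the
    k bit positions are derived by double hashing, (h1 + i * h2) mod m.
    The keys are secrets shared by the data holders, not the linker.
    """
    bits = set()
    for gram in bigrams(value):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hmac.new(key_a, data, hashlib.sha256).digest(), "big")
        h2 = int.from_bytes(hmac.new(key_b, data, hashlib.sha256).digest(), "big")
        for i in range(k):
            bits.add((h1 + i * h2) % m)
    return bits


def dice(bits_a, bits_b):
    """Dice coefficient of two Bloom filters given as sets of set-bit positions."""
    if not bits_a and not bits_b:
        return 1.0
    return 2 * len(bits_a & bits_b) / (len(bits_a) + len(bits_b))


# Hypothetical shared secret keys, for illustration only.
KEY_A = b"example-key-a"
KEY_B = b"example-key-b"

smith = bloom_encode("smith", KEY_A, KEY_B)
smyth = bloom_encode("smyth", KEY_A, KEY_B)
jones = bloom_encode("jones", KEY_A, KEY_B)

print("smith vs smyth:", round(dice(smith, smyth), 3))
print("smith vs jones:", round(dice(smith, jones), 3))
```

Under these assumptions, the party performing the linkage sees only the encoded filters, yet names that share most of their bigrams (such as "smith" and "smyth") still score higher than unrelated names. An exact-match comparison of encrypted values cannot provide this graded similarity.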