Regenstrief Institute, Inc, Center for Biomedical Informatics, Indianapolis, Indiana, USA.
School of Medicine, Department of Family Medicine, Indiana University, Indianapolis, Indiana, USA.
J Am Med Inform Assoc. 2019 May 1;26(5):447-456. doi: 10.1093/jamia/ocy191.
This study evaluated the degree to which recommendations for demographic data standardization improve patient matching accuracy using real-world datasets.
We used 4 manually reviewed datasets, each containing a random selection of matches and nonmatches. The datasets comprised health information exchange (HIE) records, public health registry records, Social Security Death Master File records, and newborn screening records. Standardized fields included last name, telephone number, social security number, date of birth, and address. Matching performance was evaluated using 4 metrics: sensitivity, specificity, positive predictive value, and accuracy.
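The four metrics named above have standard definitions in terms of true and false matches and nonmatches. The sketch below shows those definitions for a matching algorithm scored against manual review. The class name, function name, and counts are hypothetical illustrations and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Record-pair counts from comparing algorithm output to manual review."""
    tp: int  # true matches the algorithm linked
    fp: int  # true nonmatches the algorithm linked
    tn: int  # true nonmatches the algorithm left unlinked
    fn: int  # true matches the algorithm missed


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def matching_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute sensitivity, specificity, PPV, and accuracy."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "accuracy": _ratio(c.tp + c.tn, total),
    }


# Hypothetical counts for illustration only (not study data).
example = matching_metrics(ConfusionCounts(tp=80, fp=10, tn=90, fn=20))
```

With these made-up counts, sensitivity is 80/100, specificity is 90/100, and accuracy is 170/200. Sensitivity can rise while accuracy stays flat if newly found true matches are offset by additional false links, which lower specificity.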
Standardizing address was independently associated with improved matching sensitivity for both the public health and HIE datasets, by approximately 0.6% and 4.5%, respectively. Overall accuracy was unchanged for both datasets because match specificity decreased. We observed no similar effect of address standardization in the death master file dataset. Standardizing last name improved matching sensitivity by 0.6% for the HIE dataset, while overall accuracy remained the same because match specificity decreased. We noted no similar effect for the other datasets. Standardizing other individual fields (telephone number, date of birth, or social security number) showed no matching improvements. Because standardizing address and standardizing last name each improved matching sensitivity, we examined their combined effect: together they improved sensitivity from 81.3% to 91.6% for the HIE dataset.
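To illustrate why field standardization can raise sensitivity, the sketch below normalizes last names and telephone numbers so that values differing only in formatting compare as equal. The rules shown (removing accents and punctuation, uppercasing, reducing phone numbers to 10 digits) are assumptions chosen for illustration; they are not the specific standardization recommendations evaluated in the study, and the function names are hypothetical.

```python
import re
import unicodedata
from typing import Optional


def standardize_last_name(name: str) -> str:
    """Illustrative rule: strip accents, drop non-letters, uppercase."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^A-Za-z]", "", ascii_only).upper()


def standardize_phone(phone: str) -> Optional[str]:
    """Illustrative rule: keep digits, drop a leading US country code.

    Returns None when the result is not a 10-digit number.
    """
    digits = re.sub(r"\D", "", phone)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits if len(digits) == 10 else None


# Two spellings of the same surname disagree on exact comparison
# but agree once standardized.
raw_equal = "O'Brien" == "OBRIEN "
std_equal = standardize_last_name("O'Brien") == standardize_last_name("OBRIEN ")
```

Here `raw_equal` is `False` and `std_equal` is `True`. The same collapsing of variants can also make distinct people look alike, which is consistent with the reported trade-off between higher sensitivity and lower specificity.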
Data standardization can improve match rates, thus ensuring that patients and clinicians have better data on which to make decisions to enhance care quality and safety.