Results from simulated data sets: probabilistic record linkage outperforms deterministic record linkage.

Affiliation

Department of Medical Informatics, Academic Medical Center, University of Amsterdam, 1100 DE Amsterdam, The Netherlands.

Publication information

J Clin Epidemiol. 2011 May;64(5):565-72. doi: 10.1016/j.jclinepi.2010.05.008. Epub 2010 Oct 16.

Abstract

OBJECTIVE

To gain insight into the performance of deterministic record linkage (DRL) vs. probabilistic record linkage (PRL) strategies under different conditions by varying the frequency of registration errors and the amount of discriminating power.

STUDY DESIGN AND SETTING

A simulation study in which data characteristics were varied to create a range of realistic linkage scenarios. For each scenario, we compared the number of misclassifications (number of false nonlinks and false links) made by the different linking strategies: deterministic full, deterministic N-1, and probabilistic.
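The three strategies named above can be illustrated with a minimal sketch. It is not the study's implementation: the field names, the m- and u-probabilities, and the threshold of 10 are hypothetical values chosen for illustration, and the probabilistic rule shown is the standard Fellegi-Sunter log-likelihood weight, assumed here because the abstract does not specify the scoring method.

```python
from math import log2

# Hypothetical linking variables (assumption, not taken from the study).
FIELDS = ["surname", "birth_date", "postcode", "sex"]

# Illustrative probabilities (assumptions):
# M[f] = P(field agrees | records belong to the same person)
# U[f] = P(field agrees | records belong to different persons)
M = {"surname": 0.95, "birth_date": 0.98, "postcode": 0.90, "sex": 0.99}
U = {"surname": 0.01, "birth_date": 0.005, "postcode": 0.02, "sex": 0.50}


def agreements(a, b):
    """Map each linking variable to True if the two records agree on it."""
    return {f: a[f] == b[f] for f in FIELDS}


def deterministic_full(a, b):
    """Link only if all N linking variables agree."""
    return all(agreements(a, b).values())


def deterministic_n_minus_1(a, b):
    """Link if at least N-1 of the N linking variables agree."""
    return sum(agreements(a, b).values()) >= len(FIELDS) - 1


def match_weight(a, b):
    """Sum of Fellegi-Sunter agreement and disagreement weights (log base 2)."""
    total = 0.0
    for f, agree in agreements(a, b).items():
        if agree:
            total += log2(M[f] / U[f])
        else:
            total += log2((1 - M[f]) / (1 - U[f]))
    return total


def probabilistic(a, b, threshold=10.0):
    """Link if the total weight reaches a threshold (hypothetical value)."""
    return match_weight(a, b) >= threshold


rec_a = {"surname": "jansen", "birth_date": "1970-01-01", "postcode": "1100", "sex": "F"}
rec_b = dict(rec_a, postcode="1101")                    # one disagreement
rec_c = dict(rec_a, postcode="1101", surname="janssen")  # two disagreements

for other in (rec_a, rec_b, rec_c):
    print(deterministic_full(rec_a, other),
          deterministic_n_minus_1(rec_a, other),
          round(match_weight(rec_a, other), 2))
```

Under these assumed values, a pair that disagrees only on postcode is rejected by the full deterministic rule but accepted by both the N-1 rule and the weight threshold, because the weight depends on which variable disagrees rather than only on how many do.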

RESULTS

The full deterministic strategy produced the lowest number of false-positive links, but at the expense of missing a considerable number of matches, depending on the error rate of the linking variables. The probabilistic strategy outperformed the deterministic strategy (full or N-1) across all scenarios. A deterministic strategy can match the performance of a probabilistic approach, provided that the decision about which disagreements should be tolerated is made correctly. This requires a priori knowledge about the quality of all linking variables, whereas this information is inherently generated by a probabilistic strategy.
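Why the full deterministic rule misses more matches as error rates rise can be seen with a small calculation. This is a sketch under two assumptions that are not stated in the abstract: errors in different linking variables are independent, and each error rate is the probability that a true match disagrees on that variable. The function name and the example rates are hypothetical.

```python
def expected_miss_fractions(error_rates):
    """Expected fraction of true matches missed by each deterministic rule.

    error_rates[i] is the assumed probability that a true match disagrees
    on linking variable i; variables are assumed to be independent.
    Returns (missed by full rule, missed by N-1 rule).
    """
    # Probability that all N variables agree.
    p_all_agree = 1.0
    for e in error_rates:
        p_all_agree *= 1 - e

    # Probability that exactly one variable disagrees.
    p_one_disagrees = 0.0
    for i, e in enumerate(error_rates):
        term = e
        for j, other in enumerate(error_rates):
            if j != i:
                term *= 1 - other
        p_one_disagrees += term

    missed_full = 1 - p_all_agree
    missed_n_minus_1 = 1 - p_all_agree - p_one_disagrees
    return missed_full, missed_n_minus_1


# Four linking variables, each with a hypothetical 5% error rate.
full, n_minus_1 = expected_miss_fractions([0.05, 0.05, 0.05, 0.05])
print(f"full: {full:.4f}, N-1: {n_minus_1:.4f}")
```

With four variables at a 5% error rate each, the full rule is expected to miss about 18.5% of true matches and the N-1 rule about 1.4%. The calculation says nothing about false links, which is the other side of the trade-off the abstract describes.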

CONCLUSION

PRL is more flexible and provides data about the quality of the linkage process that in turn can minimize the degree of linking errors, given the data provided.
