A Comparative Analysis of Speed and Accuracy for Three Off-the-Shelf De-Identification Tools.

Authors

Heider Paul M, Obeid Jihad S, Meystre Stéphane M

Affiliation

Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC.

Publication

AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:241-250. eCollection 2020.

Abstract

A growing quantity of health data is being stored in Electronic Health Records (EHR). The free-text section of these clinical notes contains important patient and treatment information for research but also contains Personally Identifiable Information (PII), which cannot be freely shared within the research community without compromising patient confidentiality and privacy rights. Significant work has been invested in investigating automated approaches to text de-identification, the process of removing or redacting PII. Few studies have examined the performance of existing de-identification pipelines in a controlled comparative analysis. In this study, we use publicly available corpora to analyze speed and accuracy differences between three de-identification systems that can be run off-the-shelf: Amazon Comprehend Medical PHId, Clinacuity's CliniDeID, and the National Library of Medicine's Scrubber. No single system dominated all the compared metrics. NLM Scrubber was the fastest while CliniDeID generally had the highest accuracy.

Similar articles

Patient Privacy in the Era of Big Data.
Balkan Med J. 2018 Jan 20;35(1):8-17. doi: 10.4274/balkanmedj.2017.0966. Epub 2017 Sep 13.
Customization scenarios for de-identification of clinical notes.
BMC Med Inform Decis Mak. 2020 Jan 30;20(1):14. doi: 10.1186/s12911-020-1026-2.

References

Rapidly retargetable approaches to de-identification in medical records.
J Am Med Inform Assoc. 2007 Sep-Oct;14(5):564-73. doi: 10.1197/jamia.M2435. Epub 2007 Jun 28.
