Strategies for de-identification and anonymization of electronic health record data for use in multicenter research studies.

Affiliation

Stanford Sleep Medicine Center, Redwood City, CA 94063-5704, USA.

Publication information

Med Care. 2012 Jul;50 Suppl(Suppl):S82-101. doi: 10.1097/MLR.0b013e3182585355.

Abstract

BACKGROUND

De-identification and anonymization are strategies used to remove patient identifiers from electronic health record data. These strategies are of paramount importance in multicenter research studies, which must share electronic health record data across multiple environments and institutions while safeguarding patient privacy.

METHODS

A systematic literature search was conducted using the keywords de-identify, deidentify, de-identification, deidentification, anonymize, anonymization, data scrubbing, and text scrubbing. The search covered publications up to June 30, 2011 and involved 6 common literature databases. A total of 1798 prospective citations were identified; 94 met the criteria for review, and the corresponding full-text articles were obtained. Search results were supplemented by review of 26 additional full-text articles, for a total of 120 full-text articles reviewed.

RESULTS

A final sample of 45 articles met inclusion criteria for review and discussion. Articles were grouped into text, images, and biological sample categories. For text-based strategies, the approaches were segregated into heuristic, lexical, and pattern-based systems versus statistical learning-based systems. For images, approaches that de-identified photographic facial images and magnetic resonance image data were described. For biological samples, approaches that managed the identifiers linked with these samples were discussed, particularly with respect to meeting the anonymization requirements needed for Institutional Review Board exemption under the Common Rule.
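
To make the heuristic, lexical, and pattern-based category concrete, the following is a minimal Python sketch of how such a text scrubber might work. It is an illustration only and is not taken from any system covered by the review: the regular expressions, the `NAME_LEXICON` entries, the placeholder labels, and the `scrub` function are all hypothetical assumptions.

```python
import re

# Hypothetical patterns, for illustration only. Real systems use far larger,
# carefully validated rule sets.
PATTERNS = [
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
    ("MRN", re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE)),
]

# Hypothetical lexicon of known surnames (the "lexical" component).
NAME_LEXICON = {"smith", "garcia"}


def scrub(text: str) -> str:
    """Replace pattern matches and lexicon hits with category placeholders."""
    # Pattern-based pass: substitute each regex match with its label.
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)

    # Lexical pass: replace any alphabetic token found in the name lexicon.
    def replace_name(match: re.Match) -> str:
        word = match.group(0)
        return "[NAME]" if word.lower() in NAME_LEXICON else word

    return re.sub(r"\b[A-Za-z]+\b", replace_name, text)


if __name__ == "__main__":
    note = "Mr. Smith, MRN: 1234567, seen 03/14/2011, call 650-555-0100."
    print(scrub(note))
```

A scrubber of this kind only removes identifiers that its rules or lexicon anticipate. A surname missing from the lexicon, or a date written in an unexpected format, passes through unchanged.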

CONCLUSIONS

Current de-identification strategies have their limitations, and statistical learning-based systems have distinct advantages over other approaches for the de-identification of free text. True anonymization is challenging, and further work is needed in the areas of de-identification of datasets and protection of genetic information.
