Repurposing the clinical record: can an existing natural language processing system de-identify clinical notes?

Authors

Morrison Frances P, Li Li, Lai Albert M, Hripcsak George

Affiliation

Columbia University Department of Biomedical Informatics, New York, NY, USA.

Publication

J Am Med Inform Assoc. 2009 Jan-Feb;16(1):37-9. doi: 10.1197/jamia.M2862. Epub 2008 Oct 24.

DOI:10.1197/jamia.M2862
PMID:18952938
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2605586/
Abstract

Electronic clinical documentation can be useful for activities such as public health surveillance, quality improvement, and research, but existing methods of de-identification may not provide sufficient protection of patient data. The general-purpose natural language processor MedLEE retains medical concepts while excluding the remaining text, so in addition to processing text into structured data, it may be able to provide a secondary benefit of de-identification. Without modifying the system, the authors tested the ability of MedLEE to remove protected health information (PHI) by comparing 100 outpatient clinical notes with the corresponding XML-tagged output. Of 809 instances of PHI, 26 (3.2%) were detected in the output as a result of processing and identification errors. However, PHI in the output was highly transformed, with much of it appearing as normalized terms for medical concepts, potentially making re-identification more difficult. The MedLEE processor may be a good enhancement to other de-identification systems, both removing PHI and providing coded data from clinical text.
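
The evaluation described in the abstract (checking whether annotated PHI from a note survives into the XML-tagged output, then reporting a leak rate) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the element and attribute names in the sample XML, and the simple string matching are hypothetical and do not reflect MedLEE's actual output schema or the comparison procedure the authors used.

```python
import re
import xml.etree.ElementTree as ET


def leaked_phi(phi_terms, xml_output):
    """Return the PHI terms that still appear in the XML output.

    Hypothetical helper: matches whole words, case-insensitively, against
    all element text and attribute values in the document.
    """
    root = ET.fromstring(xml_output)
    pieces = []
    for elem in root.iter():
        # Structured output may carry terms in text or in attributes,
        # so collect both.
        if elem.text:
            pieces.append(elem.text)
        if elem.tail:
            pieces.append(elem.tail)
        pieces.extend(elem.attrib.values())
    haystack = " ".join(pieces).lower()
    return [
        term
        for term in phi_terms
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", haystack)
    ]


def leak_rate(leaked_count, total_count):
    """Fraction of PHI instances that survived processing."""
    return leaked_count / total_count


# Made-up note annotations and made-up XML (not real MedLEE output).
# A surname that coincides with a disease name can survive as a
# normalized medical concept.
phi = ["Jane Doe", "Parkinson", "555-0100"]
xml = '<note><problem v="cough"/><problem v="parkinson disease"/></note>'

print(leaked_phi(phi, xml))                  # ['Parkinson']
print(round(leak_rate(26, 809) * 100, 1))    # 3.2, the figure in the abstract
```

The last line reproduces the reported arithmetic: 26 of 809 PHI instances is about 3.2%. The made-up example also mirrors the abstract's observation that leaked PHI tends to appear in transformed form, here as part of a normalized concept rather than as the original name.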
