Learning to identify Protected Health Information by integrating knowledge- and data-driven algorithms: A case study on psychiatric evaluation notes.

Affiliations

School of Computer Science, University of Manchester, Manchester, UK; The Christie NHS Foundation Trust, Manchester, UK.

Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia.

Publication information

J Biomed Inform. 2017 Nov;75S:S28-S33. doi: 10.1016/j.jbi.2017.06.005. Epub 2017 Jun 7.

DOI:10.1016/j.jbi.2017.06.005
PMID:28602908
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5705401/
Abstract

De-identification of clinical narratives is one of the main obstacles to making healthcare free text available for research. In this paper, we describe our experience in expanding and tailoring two existing tools as part of the 2016 CEGS N-GRID Shared Tasks Track 1, which evaluated de-identification methods on a set of psychiatric evaluation notes for up to 25 different types of Protected Health Information (PHI). The methods we used rely on machine learning over either a large or a small feature space, with additional strategies, including two-pass tagging and multi-class models, both of which proved beneficial. The results show that integrating the proposed methods can identify PHI defined by the Health Insurance Portability and Accountability Act (HIPAA) with overall F-scores of about 90% or higher. However, some classes (Profession, Organization) again proved challenging because the expressions used to refer to this information vary widely.
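For readers unfamiliar with the F-score reported above, the following is a minimal sketch of how precision, recall, and F1 can be computed for PHI spans. It assumes strict, exact-match comparison of (start, end, type) annotations; the `Span` alias, the `f_score` function, and the sample offsets are hypothetical and are not taken from the paper or from the shared task's evaluation script.

```python
from typing import Set, Tuple

# A PHI annotation as (start_offset, end_offset, phi_type).
# This representation is an illustrative assumption.
Span = Tuple[int, int, str]


def f_score(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under strict exact-match comparison."""
    # A prediction counts as correct only if offsets and type both match.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up annotations for one note: the system finds NAME and DATE,
# mislabels a PROFESSION span as ORGANIZATION, and misses one ORGANIZATION.
gold = {
    (0, 10, "NAME"),
    (25, 35, "DATE"),
    (50, 58, "PROFESSION"),
    (70, 80, "ORGANIZATION"),
}
predicted = {
    (0, 10, "NAME"),
    (25, 35, "DATE"),
    (50, 58, "ORGANIZATION"),
}

precision, recall, f1 = f_score(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Under strict matching, a span with the right offsets but the wrong class counts as both a false positive and a false negative, which is one reason classes with highly variable wording can pull the overall score down.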
