A hybrid approach to automatic de-identification of psychiatric notes.

Author affiliation

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States.

Publication information

J Biomed Inform. 2017 Nov;75S:S19-S27. doi: 10.1016/j.jbi.2017.06.006. Epub 2017 Jun 7.


DOI:10.1016/j.jbi.2017.06.006
PMID:28602904
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5705430/
Abstract

De-identification, or identifying and removing protected health information (PHI) from clinical data, is a critical step in making clinical data available for clinical applications and research. This paper presents a natural language processing system for automatic de-identification of psychiatric notes, which was designed to participate in the 2016 CEGS N-GRID shared task Track 1. The system has a hybrid structure that combines machine learning techniques and rule-based approaches. The rule-based components exploit the structure of the psychiatric notes as well as characteristic surface patterns of PHI mentions. The machine learning components use supervised learning with rich features. In addition, system performance was boosted by integrating additional data into the training set through domain adaptation. The hybrid system achieved an overall micro-averaged F-score of 90.74 on the test set, the second-best result among all participants in the CEGS N-GRID task.
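
As an illustration of the metric named in the abstract, the following is a minimal sketch of a micro-averaged F-score over PHI categories. It is not the shared task's official scorer: the strict span matching, the `(doc_id, start, end)` span representation, the function name `micro_f1`, and the toy data are all assumptions made for this example. Micro-averaging pools true positives, false positives, and false negatives across all categories before computing precision and recall, so frequent categories weigh more than rare ones.

```python
def micro_f1(gold, pred):
    """Micro-averaged F-score over PHI categories.

    gold, pred: dict mapping a category name to a set of
    (doc_id, start, end) spans. Matching is strict (exact span and
    category), which is an assumption of this sketch.
    """
    tp = fp = fn = 0
    for category in set(gold) | set(pred):
        g = gold.get(category, set())
        p = pred.get(category, set())
        tp += len(g & p)   # spans found with the right category
        fp += len(p - g)   # predicted spans absent from the gold standard
        fn += len(g - p)   # gold spans the system missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy annotations, not data from the paper.
gold = {
    "NAME": {("d1", 0, 5), ("d1", 10, 15)},
    "DATE": {("d1", 20, 30)},
}
pred = {
    "NAME": {("d1", 0, 5)},
    "DATE": {("d1", 20, 30), ("d1", 40, 44)},
}
score = micro_f1(gold, pred)
print(f"{100 * score:.2f}")
```

In this toy case the pooled counts are 2 true positives, 1 false positive, and 1 false negative, so precision and recall are both 2/3. The abstract reports its score on a 0 to 100 scale, which is why the example multiplies by 100 before printing.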
