Comparative Study of Various Approaches for Ensemble-based De-identification of Electronic Health Record Narratives.

Affiliations

Medical University of South Carolina, Charleston, South Carolina, USA.

Clinacuity, Inc., Charleston, South Carolina, USA.

Publication information

AMIA Annu Symp Proc. 2021 Jan 25;2020:648-657. eCollection 2020.

PMID: 33936439
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8075417/
Abstract

De-identification of electronic health record narratives is a fundamental task applying natural language processing to better protect patient information privacy. We explore different types of ensemble learning methods to improve clinical text de-identification. We present two ensemble-based approaches for combining multiple predictive models. The first method selects an optimal subset of de-identification models by greedy exclusion. This ensemble pruning allows one to save computational time or physical resources while achieving similar or better performance than the ensemble of all members. The second method uses a sequence of words to train a sequential model. For this sequence labelling-based stacked ensemble, we employ search-based structured prediction and bidirectional long short-term memory algorithms. We create ensembles consisting of de-identification models trained on two clinical text corpora. Experimental results show that our ensemble systems can effectively integrate predictions from individual models and offer better generalization across two different corpora.
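The abstract does not spell out the greedy exclusion procedure, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each model emits one label per token, that the ensemble combines them by plurality vote, and that a token-level F1 over non-"O" labels on a held-out set is the selection criterion. The function names (`majority_vote`, `phi_f1`, `greedy_exclusion`), the label set, and the rule of accepting a removal when the score ties are all illustrative assumptions.

```python
from collections import Counter
from typing import Dict, List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-token labels from several models by plurality vote.

    Ties go to the label seen first, i.e. the earliest model in the list
    (an arbitrary choice made for this sketch).
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


def phi_f1(gold: Sequence[str], pred: Sequence[str], outside: str = "O") -> float:
    """Token-level F1 over labels other than `outside` (assumed metric)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == p and g != outside)
    if tp == 0:
        return 0.0
    precision = tp / sum(1 for p in pred if p != outside)
    recall = tp / sum(1 for g in gold if g != outside)
    return 2 * precision * recall / (precision + recall)


def greedy_exclusion(model_preds: Dict[str, List[str]], gold: Sequence[str]) -> List[str]:
    """Drop one model at a time while the voted ensemble does not get worse."""
    selected = sorted(model_preds)
    best = phi_f1(gold, majority_vote([model_preds[m] for m in selected]))
    while len(selected) > 1:
        # Score the ensemble obtained by leaving out each remaining model.
        candidates = []
        for m in selected:
            rest = [x for x in selected if x != m]
            score = phi_f1(gold, majority_vote([model_preds[x] for x in rest]))
            candidates.append((score, m))
        score, to_drop = max(candidates, key=lambda c: c[0])
        if score < best:  # every possible removal hurts: stop pruning
            break
        best = score
        selected.remove(to_drop)
    return selected


# Toy held-out data: model "a" is perfect, "b" misses one token, "c" is poor.
gold = ["O", "NAME", "NAME", "O", "DATE", "O"]
model_preds = {
    "a": ["O", "NAME", "NAME", "O", "DATE", "O"],
    "b": ["O", "NAME", "O", "O", "DATE", "O"],
    "c": ["NAME", "O", "O", "O", "O", "DATE"],
}
kept = greedy_exclusion(model_preds, gold)
print(kept)
```

Accepting a removal when the score ties reflects the stated goal of saving computation at similar or better performance; a stricter variant would require a strict improvement before dropping a member.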