Not Fully Synthetic: LLM-based Hybrid Approaches Towards Privacy-Preserving Clinical Note Sharing.

Authors

Rahman Sarkar Atiquer, Chuang Yao-Shun, Jiang Xiaoqian, Mohammed Noman

Affiliation

University of Manitoba, Winnipeg, Manitoba, Canada.

Publication information

AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:441-450. eCollection 2025.


DOI:
PMID: 40502247
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12150723/
Abstract

The publication and sharing of clinical notes are crucial for healthcare research and innovation. However, privacy regulations such as HIPAA and GDPR pose significant challenges. While de-identification techniques aim to remove protected health information, they often fall short of achieving complete privacy protection. Similarly, the current state of synthetic clinical note generation can lack nuance and content coverage. To address these limitations, we propose an approach that combines de-identification, filtration, and synthetic clinical note generation. Variations of this approach currently retain 36%-61% of the original note's content and fill the remaining gaps using an LLM, ensuring high information coverage. We also evaluated the de-identification performance of the hybrid notes, demonstrating that they surpass or at least match the standalone de-identification methods. Our results show that hybrid notes can maintain patient privacy while preserving the richness of clinical data. This approach offers a promising solution for safe and effective data sharing, encouraging further research.
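The hybrid idea in the abstract (keep the parts of a note judged safe, and regenerate the rest) can be illustrated with a minimal sketch. This is not the authors' pipeline: the regex patterns, the keep-or-regenerate rule, the word-based retention measure, and the names `deidentify`, `hybrid_note`, and `stub_fill` are all assumptions made for illustration, and `stub_fill` is only a stand-in for an LLM call.

```python
import re

# Hypothetical PHI patterns for illustration only; real de-identification
# tools cover far more categories and formats than these three.
PHI_PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "NAME": re.compile(r"\b(?:Mr|Mrs|Ms|Dr)\.\s+[A-Z][a-z]+"),
}


def deidentify(sentence):
    """Replace each pattern match with a [TYPE] tag; return text and hit count."""
    hits = 0
    for label, pattern in PHI_PATTERNS.items():
        sentence, n = pattern.subn(f"[{label}]", sentence)
        hits += n
    return sentence, hits


def hybrid_note(sentences, fill_gap):
    """Keep sentences with no detected PHI; hand the rest to fill_gap.

    Returns the hybrid sentences and the fraction of original words kept
    verbatim (an assumed, simplistic notion of "retained content").
    """
    out, kept_words, total_words = [], 0, 0
    for s in sentences:
        n_words = len(s.split())
        total_words += n_words
        redacted, hits = deidentify(s)
        if hits == 0:
            out.append(s)                   # filtration step: safe, keep as-is
            kept_words += n_words
        else:
            out.append(fill_gap(redacted))  # gap to be filled synthetically
    retained = kept_words / total_words if total_words else 0.0
    return out, retained


def stub_fill(redacted):
    # Stand-in for an LLM rewrite; a real system would call a model here.
    return "[SYNTHETIC] " + redacted


note = [
    "Patient seen by Dr. Smith on 03/14/2021.",
    "Blood pressure was stable.",
    "Call 204-555-0199 for follow-up.",
    "No acute distress noted.",
]
out, retained = hybrid_note(note, stub_fill)
for line in out:
    print(line)
print(f"retained fraction: {retained:.2f}")
```

In this toy note, two of the four sentences are kept verbatim and the other two are redacted and routed to the stub. The retained fraction here is a property of the example only and says nothing about the 36%-61% range reported in the abstract.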
