Rahman Sarkar Atiquer, Chuang Yao-Shun, Jiang Xiaoqian, Mohammed Noman
University of Manitoba, Winnipeg, Manitoba, Canada.
AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:441-450. eCollection 2025.
The publication and sharing of clinical notes are crucial for healthcare research and innovation. However, privacy regulations such as HIPAA and GDPR pose significant challenges. While de-identification techniques aim to remove protected health information, they often fall short of achieving complete privacy protection. Similarly, the current state of synthetic clinical note generation can lack nuance and content coverage. To address these limitations, we propose an approach that combines de-identification, filtration, and synthetic clinical note generation. Variations of this approach currently retain 36%-61% of the original note's content and fill the remaining gaps using an LLM, ensuring high information coverage. We also evaluated the de-identification performance of the hybrid notes, demonstrating that they surpass or at least match the standalone de-identification methods. Our results show that hybrid notes can maintain patient privacy while preserving the richness of clinical data. This approach offers a promising solution for safe and effective data sharing, encouraging further research.
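The three stages named in the abstract (de-identification, filtration, and synthetic gap filling) can be illustrated with a minimal sketch. This is not the authors' implementation: the regex patterns, the sentence-level filtering rule, the `build_hybrid` and `placeholder_fill` names, and the sentence-count retention measure are all assumptions made for illustration, and a real system would call an actual de-identification tool and an LLM where this sketch uses stand-ins.

```python
import re

# Hypothetical PHI patterns for illustration only; not the paper's method.
PHI_PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),
}


def deidentify(sentence):
    """Replace each pattern match with a [TAG] placeholder.

    Returns the scrubbed sentence and the number of replacements made.
    """
    hits = 0
    for tag, pattern in PHI_PATTERNS.items():
        sentence, count = pattern.subn(f"[{tag}]", sentence)
        hits += count
    return sentence, hits


def split_sentences(note):
    """Naive splitter: break on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def placeholder_fill(scrubbed):
    """Stand-in for an LLM call that would rewrite the scrubbed sentence."""
    return "[SYNTHETIC] " + scrubbed


def build_hybrid(note, fill_gap):
    """Keep sentences with no PHI hits verbatim; regenerate the rest.

    Returns the hybrid note and the fraction of sentences kept unchanged
    (an assumed proxy for content retention, not the paper's metric).
    """
    sentences = split_sentences(note)
    kept = 0
    output = []
    for sentence in sentences:
        scrubbed, hits = deidentify(sentence)
        if hits == 0:
            output.append(sentence)  # filtration: passes through untouched
            kept += 1
        else:
            output.append(fill_gap(scrubbed))  # gap handed to the generator
    retained = kept / len(sentences) if sentences else 0.0
    return " ".join(output), retained


note = (
    "Patient seen on 03/14/2024 for chest pain. Vitals stable. "
    "Call 204-555-0101 with results. MRN: 123456 on file."
)
hybrid, retained = build_hybrid(note, placeholder_fill)
```

Passing the generator in as `fill_gap` keeps the sketch independent of any particular LLM client; swapping `placeholder_fill` for a real model call would not change the filtering logic.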