Anonymizing and Sharing Medical Text Records

Author information

Li Xiao-Bai, Qin Jialun

Affiliation

Department of Operations and Information Systems, Manning School of Business, University of Massachusetts Lowell, Lowell, Massachusetts 01854.

Publication information

Inf Syst Res. 2017;28(2):332-352. doi: 10.1287/isre.2016.0676. Epub 2017 Apr 12.

DOI:10.1287/isre.2016.0676

PMID:29569650

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5858761/

Abstract

Health information technology has increased accessibility of health and medical data and benefited medical research and healthcare management. However, there are rising concerns about patient privacy in sharing medical and healthcare data. A large amount of these data are in free text form. Existing techniques for privacy-preserving data sharing deal largely with structured data. Current privacy approaches for medical text data focus on detection and removal of patient identifiers from the data, which may be inadequate for protecting privacy or preserving data quality. We propose a new systematic approach to extract, cluster, and anonymize medical text records. Our approach integrates methods developed in both data privacy and health informatics fields. The key novel elements of our approach include a recursive partitioning method to cluster medical text records based on the similarity of the health and medical information and a value-enumeration method to anonymize potentially identifying information in the text data. An experimental study is conducted using real-world medical documents. The results of the experiments demonstrate the effectiveness of the proposed approach.


Similar articles

Anonymizing and Sharing Medical Text Records.

Inf Syst Res. 2017;28(2):332-352. doi: 10.1287/isre.2016.0676. Epub 2017 Apr 12.

Utility-preserving anonymization for health data publishing.

BMC Med Inform Decis Mak. 2017 Jul 11;17(1):104. doi: 10.1186/s12911-017-0499-0.

Digression and Value Concatenation to Enable Privacy-Preserving Regression.

MIS Q. 2014 Sep;38(3):679-698. doi: 10.25300/misq/2014/38.3.03.

Protecting Privacy When Sharing and Releasing Data with Multiple Records per Person.

J Assoc Inf Syst. 2020;21(6):1461-1485. doi: 10.17705/1jais.00643.

Privacy preserving data anonymization of spontaneous ADE reporting system dataset.

BMC Med Inform Decis Mak. 2016 Jul 18;16 Suppl 1(Suppl 1):58. doi: 10.1186/s12911-016-0293-4.

Differentially private release of medical microdata: an efficient and practical approach for preserving informative attribute values.

BMC Med Inform Decis Mak. 2020 Jul 8;20(1):155. doi: 10.1186/s12911-020-01171-5.

Enabling Health Data Sharing with Fine-Grained Privacy.

Proc ACM Int Conf Inf Knowl Manag. 2023 Oct;2023:131-141. doi: 10.1145/3583780.3614864. Epub 2023 Oct 21.

On Anonymizing Medical Microdata with Large-Scale Missing Values - A Case Study with the FAERS Dataset.

Annu Int Conf IEEE Eng Med Biol Soc. 2019 Jul;2019:6505-6508. doi: 10.1109/EMBC.2019.8857025.

Use and Understanding of Anonymization and De-Identification in the Biomedical Literature: Scoping Review.

J Med Internet Res. 2019 May 31;21(5):e13484. doi: 10.2196/13484.

The cost of quality: Implementing generalization and suppression for anonymizing biomedical data with minimal information loss.

J Biomed Inform. 2015 Dec;58:37-48. doi: 10.1016/j.jbi.2015.09.007. Epub 2015 Sep 15.

Cited by

Not Fully Synthetic: LLM-based Hybrid Approaches Towards Privacy-Preserving Clinical Note Sharing.

AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:441-450. eCollection 2025.

Intelligent health in the IS area: A literature review and research agenda.

Fundam Res. 2023 May 11;4(4):961-971. doi: 10.1016/j.fmre.2023.04.008. eCollection 2024 Jul.

De-identification of free text data containing personal health information: a scoping review of reviews.

Int J Popul Data Sci. 2023 Dec 12;8(1):2153. doi: 10.23889/ijpds.v8i1.2153. eCollection 2023.

Privacy Protection and Secondary Use of Health Data: Strategies and Methods.

Biomed Res Int. 2021 Oct 7;2021:6967166. doi: 10.1155/2021/6967166. eCollection 2021.

Blockchain-Based Medical Records Secure Storage and Medical Service Framework.

J Med Syst. 2018 Nov 22;43(1):5. doi: 10.1007/s10916-018-1121-4.

References

Class Restricted Clustering and Micro-Perturbation for Data Privacy.

Manage Sci. 2013 Apr 1;59(4). doi: 10.1287/mnsc.1120.1584.

Mining electronic health records: towards better research applications and clinical care.

Nat Rev Genet. 2012 May 2;13(6):395-405. doi: 10.1038/nrg3208.

Strategies for maintaining patient privacy in i2b2.

J Am Med Inform Assoc. 2011 Dec;18 Suppl 1(Suppl 1):i103-8. doi: 10.1136/amiajnl-2011-000316. Epub 2011 Oct 7.

Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications.

J Am Med Inform Assoc. 2010 Sep-Oct;17(5):507-13. doi: 10.1136/jamia.2009.001560.

Automatic de-identification of textual documents in the electronic health record: a review of recent research.

BMC Med Res Methodol. 2010 Aug 2;10:70. doi: 10.1186/1471-2288-10-70.

Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2).

J Am Med Inform Assoc. 2010 Mar-Apr;17(2):124-30. doi: 10.1136/jamia.2009.000893.

Extracting information from textual documents in the electronic health record: a review of recent research.

Yearb Med Inform. 2008:128-44.

A software tool for removing patient identifying information from clinical documents.

J Am Med Inform Assoc. 2008 Sep-Oct;15(5):601-10. doi: 10.1197/jamia.M2702. Epub 2008 Jun 25.

Rapidly retargetable approaches to de-identification in medical records.

J Am Med Inform Assoc. 2007 Sep-Oct;14(5):564-73. doi: 10.1197/jamia.M2435. Epub 2007 Jun 28.

Evaluating the state-of-the-art in automatic de-identification.

J Am Med Inform Assoc. 2007 Sep-Oct;14(5):550-63. doi: 10.1197/jamia.M2444. Epub 2007 Jun 28.

