Anonymization of administrative billing codes with repeated diagnoses through censoring.

Authors

Tamersoy Acar, Loukides Grigorios, Denny Joshua C, Malin Bradley

Affiliation

Department of Biomedical Informatics, School of Medicine, Vanderbilt University, Nashville, Tennessee.

Publication

AMIA Annu Symp Proc. 2010 Nov 13;2010:782-6.

PMID:21347085

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3041421/

Abstract

Patient-specific data from electronic medical records (EMRs) are increasingly shared in de-identified form to support research. However, EMRs are susceptible to noise, error, and variation, which can limit their utility for reuse. One way to enhance the utility of EMRs is to record, when the data are shared, the number of times each diagnosis code is assigned to a patient. This is challenging, however, because such counts may be exploited to compromise patients' identities. In this paper, we present an approach that, to the best of our knowledge, is the first that can prevent re-identification through repeated diagnosis codes. Our method transforms records to preserve privacy while retaining much of their utility. Experiments on the records of 2676 patients from the EMR system of the Vanderbilt University Medical Center verify that our method retains an average of 95.4% of the diagnosis codes in a common data sharing scenario.
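
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of censoring repeat counts, not the authors' method. It assumes, purely for illustration, that privacy is framed as k-anonymity over each patient's released (code, count) profile, that a single uniform cap is applied to every count, and that utility is measured as the fraction of code occurrences that survive. The function names and the sample records are hypothetical.

```python
from collections import Counter

def censor(records, cap):
    """Cap each diagnosis-code count at `cap` (assumed uniform censoring)."""
    return [{code: min(n, cap) for code, n in rec.items()} for rec in records]

def is_k_anonymous(records, k):
    """True if every released profile is shared by at least k patients."""
    groups = Counter(frozenset(rec.items()) for rec in records)
    return all(size >= k for size in groups.values())

def anonymize(records, k):
    """Return (censored_records, cap) for the largest cap that yields
    k-anonymity, or (None, None) if even cap=1 is not enough."""
    max_count = max((n for rec in records for n in rec.values()), default=1)
    for cap in range(max_count, 0, -1):
        censored = censor(records, cap)
        if is_k_anonymous(censored, k):
            return censored, cap
    return None, None

def retained_fraction(original, censored):
    """Share of code occurrences kept after censoring (assumed utility measure)."""
    total = sum(sum(rec.values()) for rec in original)
    kept = sum(sum(rec.values()) for rec in censored)
    return kept / total if total else 1.0

# Hypothetical toy data: each dict maps a diagnosis code to its repeat count.
records = [
    {"250.00": 3, "401.1": 1},
    {"250.00": 2, "401.1": 1},
    {"250.00": 5},
    {"250.00": 4},
]
censored, cap = anonymize(records, k=2)
utility = retained_fraction(records, censored)
```

A uniform cap keeps the sketch short but discards more information than necessary; a per-code or per-group cap would retain more counts at the cost of a larger search.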
