De-identification of primary care electronic medical records free-text data in Ontario, Canada.

Affiliation

Institute for Clinical Evaluative Sciences G106, Toronto, Ontario, M4N 3M5, Canada.

Publication information

BMC Med Inform Decis Mak. 2010 Jun 18;10:35. doi: 10.1186/1472-6947-10-35.

DOI:10.1186/1472-6947-10-35
PMID:20565894
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2907300/
Abstract

BACKGROUND

Electronic medical records (EMRs) represent a potentially rich source of health information for research, but the free-text in EMRs often contains identifying information. While de-identification tools have been developed for free-text, none have been developed or tested for the full range of primary care EMR data.

METHODS

We took deid, an open-source de-identification program, and modified it for the Ontario context and for use on primary care EMR data. We developed the modified program on a training set of 1000 free-text records from one group practice. We then tested it on two validation sets: a random sample of 700 free-text EMR records from 17 physicians in 7 practices across 5 cities, and 500 free-text records from a group practice in a different city from the practice used for the training set. The tool was meant to remove patient and physician names, locations, addresses, and medical record, health card and telephone numbers. We measured its sensitivity (recall), precision, specificity, accuracy and F-measure against manually tagged free-text records.
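The five measures named above can all be derived from a confusion matrix that compares the tool's output with the manual tags. The following is a minimal sketch, assuming each unit of text (for example, a token) is scored as a true or false positive or negative; the abstract does not state the unit of evaluation, and the counts and the names `Counts` and `metrics` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical confusion-matrix counts (not data from the study)."""
    tp: int  # identifiers the tool removed (correctly)
    fp: int  # non-identifiers the tool removed (wrongly)
    tn: int  # non-identifiers the tool left in place (correctly)
    fn: int  # identifiers the tool missed


def metrics(c: Counts) -> dict[str, float]:
    """Compute the standard measures from confusion-matrix counts."""
    sensitivity = c.tp / (c.tp + c.fn)   # also called recall
    precision = c.tp / (c.tp + c.fp)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    # F-measure: harmonic mean of precision and recall
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    example = Counts(tp=80, fp=10, tn=90, fn=20)  # made-up numbers
    for name, value in metrics(example).items():
        print(f"{name}: {value:.3f}")
```

In a de-identification setting, sensitivity is usually the measure to watch most closely, because a false negative leaves an identifier in the text, whereas a false positive only removes some clinical content.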

RESULTS

On the training set, the modified program performed with a sensitivity of 88.3%, specificity of 91.4%, precision of 91.3%, accuracy of 89.9% and F-measure of 0.90. The validation sets had sensitivities of 86.7% and 80.2%, specificities of 91.4% and 87.7%, precisions of 91.1% and 87.4%, accuracies of 89.0% and 83.8% and F-measures of 0.89 and 0.84 for the first and second validation sets, respectively.
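The reported F-measures can be cross-checked against the reported precision and sensitivity, assuming the F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, so this is an assumption. A short sketch using only the figures quoted above:

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, sensitivity/recall, reported F-measure) from the abstract
reported = {
    "training": (0.913, 0.883, 0.90),
    "validation 1": (0.911, 0.867, 0.89),
    "validation 2": (0.874, 0.802, 0.84),
}

if __name__ == "__main__":
    for name, (p, r, f_reported) in reported.items():
        f = f_measure(p, r)
        print(f"{name}: computed {f:.3f}, reported {f_reported:.2f}")
```

Under that assumption, the computed values round to the reported 0.90, 0.89 and 0.84.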

CONCLUSION

The deid program can be modified to de-identify free-text primary care EMR records with reasonable accuracy while preserving clinical content.

Similar articles

1
De-identification of primary care electronic medical records free-text data in Ontario, Canada.
BMC Med Inform Decis Mak. 2010 Jun 18;10:35. doi: 10.1186/1472-6947-10-35.
2
De-identifying an EHR database - anonymity, correctness and readability of the medical record.
Stud Health Technol Inform. 2011;169:862-6.
3
The MITRE Identification Scrubber Toolkit: design, training, and assessment.
Int J Med Inform. 2010 Dec;79(12):849-59. doi: 10.1016/j.ijmedinf.2010.09.007. Epub 2010 Oct 14.
4
Automated de-identification of free-text medical records.
BMC Med Inform Decis Mak. 2008 Jul 24;8:32. doi: 10.1186/1472-6947-8-32.
5
Effects of personal identifier resynthesis on clinical text de-identification.
J Am Med Inform Assoc. 2010 Mar-Apr;17(2):159-68. doi: 10.1136/jamia.2009.002212.
6
Proposal and evaluation of FASDIM, a Fast And Simple De-Identification Method for unstructured free-text clinical records.
Int J Med Inform. 2014 Apr;83(4):303-12. doi: 10.1016/j.ijmedinf.2013.11.005. Epub 2013 Dec 7.
7
Developing a standard for de-identifying electronic patient records written in Swedish: precision, recall and F-measure in a manual and computerized annotation trial.
Int J Med Inform. 2009 Dec;78(12):e19-26. doi: 10.1016/j.ijmedinf.2009.04.005. Epub 2009 May 23.
8
Are family physicians comprehensively using electronic medical records such that the data can be used for secondary purposes? A Canadian perspective.
BMC Med Inform Decis Mak. 2015 Aug 13;15:67. doi: 10.1186/s12911-015-0195-x.
9
Utility of linking primary care electronic medical records with Canadian census data to study the determinants of chronic disease: an example based on socioeconomic status and obesity.
BMC Med Inform Decis Mak. 2016 Mar 11;16:32. doi: 10.1186/s12911-016-0272-9.
10
Development and evaluation of an open source software tool for deidentification of pathology reports.
BMC Med Inform Decis Mak. 2006 Mar 6;6:12. doi: 10.1186/1472-6947-6-12.

Cited by

1
A Novel COVID-19 Data Set and an Effective Deep Learning Approach for the De-Identification of Italian Medical Records.
IEEE Access. 2021 Jan 25;9:19097-19110. doi: 10.1109/ACCESS.2021.3054479. eCollection 2021.
2
P2P watch: personal health information detection in peer-to-peer file-sharing networks.
J Med Internet Res. 2012 Jul 9;14(4):e95. doi: 10.2196/jmir.1898.
3
Strategies for de-identification and anonymization of electronic health record data for use in multicenter research studies.
Med Care. 2012 Jul;50 Suppl(Suppl):S82-101. doi: 10.1097/MLR.0b013e3182585355.
4
Methods for the de-identification of electronic health records for genomic research.
Genome Med. 2011 Apr 27;3(4):25. doi: 10.1186/gm239.
5
Feedback GAP: study protocol for a cluster-randomized trial of goal setting and action plans to increase the effectiveness of audit and feedback interventions in primary care.
Implement Sci. 2010 Dec 17;5:98. doi: 10.1186/1748-5908-5-98.

References

1
Testing tactics to localize de-identification.
Stud Health Technol Inform. 2009;150:735-9.
2
Developing a standard for de-identifying electronic patient records written in Swedish: precision, recall and F-measure in a manual and computerized annotation trial.
Int J Med Inform. 2009 Dec;78(12):e19-26. doi: 10.1016/j.ijmedinf.2009.04.005. Epub 2009 May 23.
3
Using data from electronic medical records: theory versus practice.
Healthc Q. 2008;11(4):23-5. doi: 10.12927/hcq.2008.20088.
4
Automated de-identification of free-text medical records.
BMC Med Inform Decis Mak. 2008 Jul 24;8:32. doi: 10.1186/1472-6947-8-32.
5
A de-identifier for medical discharge summaries.
Artif Intell Med. 2008 Jan;42(1):13-35. doi: 10.1016/j.artmed.2007.10.001. Epub 2007 Nov 28.
6
State-of-the-art anonymization of medical records using an iterative machine learning framework.
J Am Med Inform Assoc. 2007 Sep-Oct;14(5):574-80. doi: 10.1197/j.jamia.M2441.
7
Rapidly retargetable approaches to de-identification in medical records.
J Am Med Inform Assoc. 2007 Sep-Oct;14(5):564-73. doi: 10.1197/jamia.M2435. Epub 2007 Jun 28.
8
Evaluating the state-of-the-art in automatic de-identification.
J Am Med Inform Assoc. 2007 Sep-Oct;14(5):550-63. doi: 10.1197/jamia.M2444. Epub 2007 Jun 28.
9
Development and evaluation of an open source software tool for deidentification of pathology reports.
BMC Med Inform Decis Mak. 2006 Mar 6;6:12. doi: 10.1186/1472-6947-6-12.
10
Evaluation of a deidentification (De-Id) software engine to share pathology reports and clinical documents for research.
Am J Clin Pathol. 2004 Feb;121(2):176-86. doi: 10.1309/E6K3-3GBP-E5C2-7FYU.