Classifying Cyber-Risky Clinical Notes by Employing Natural Language Processing.

Authors

Schmeelk Suzanna, Dogo Martins Samuel, Peng Yifan, Patra Braja Gopal

Affiliations

St. John's University, Queens, New York.

Queen's University Belfast, United Kingdom.

Publication information

Proc Annu Hawaii Int Conf Syst Sci. 2022;2022:4140-4146. doi: 10.24251/hicss.2022.505. Epub 2022 Jan 4.

Abstract

Clinical notes, which can be embedded in electronic medical records, document patient care delivery and summarize interactions between healthcare providers and patients. These notes directly inform patient care and indirectly inform research and quality/safety metrics, among other uses. Recently, some states within the United States of America now require that patients have open access to their clinical notes, to improve the exchange of patient information for patient care. Developing methods to assess the cyber risks of clinical notes before data are shared and exchanged is therefore critical. While existing natural language processing techniques are geared toward de-identifying clinical notes, to the best of our knowledge few have focused on classifying sensitive-information risk, a fundamental step toward effective, widespread protection of patient health information. To bridge this gap, this research investigates methods for identifying security/privacy risks within clinical notes. The classification can be used either upstream, to identify areas within notes that likely contain sensitive information, or downstream, to improve the identification of clinical notes that have not been entirely de-identified. We develop several models that combine unigram and word2vec features with different classifiers to categorize sentence-level risk. Experiments on the i2b2 de-identification dataset show that the SVM classifier using word2vec features achieved the highest F1-score, 0.792. Future research involves articulating and differentiating risk in terms of different global regulatory requirements.
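To make the sentence-level setup concrete, the following is a minimal sketch of a unigram-feature risk classifier scored with F1. It is not the authors' implementation: the toy sentences and labels are invented for illustration (they are not drawn from i2b2), a simple perceptron stands in for the SVM and other classifiers the abstract mentions, and word2vec features are not shown. All function names are hypothetical.

```python
from collections import Counter


def unigram_features(sentence):
    """Bag-of-words counts over lowercased, whitespace-split tokens."""
    return Counter(sentence.lower().split())


def score(weights, bias, feats):
    """Linear score: bias plus the weighted sum of unigram counts."""
    return bias + sum(weights.get(tok, 0.0) * n for tok, n in feats.items())


def predict(weights, bias, sentence):
    """Label a sentence 1 (risky) if its linear score is positive, else 0."""
    return 1 if score(weights, bias, unigram_features(sentence)) > 0 else 0


def train_perceptron(sentences, labels, epochs=10):
    """Perceptron training; an assumed stand-in for the paper's SVM."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for sent, y in zip(sentences, labels):
            if predict(weights, bias, sent) != y:
                # Move the decision boundary toward the misclassified example.
                step = 1.0 if y == 1 else -1.0
                for tok, n in unigram_features(sent).items():
                    weights[tok] = weights.get(tok, 0.0) + step * n
                bias += step
    return weights, bias


def f1_score(y_true, y_pred):
    """F1 for the positive (risky) class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = sentence carries sensitive information, 0 = it does not.
SENTENCES = [
    "patient john smith was seen on 2019-03-04",
    "mrn 123456 admitted to st mary hospital",
    "blood pressure was stable overnight",
    "continue aspirin 81 mg daily",
]
LABELS = [1, 1, 0, 0]

weights, bias = train_perceptron(SENTENCES, LABELS)
predictions = [predict(weights, bias, s) for s in SENTENCES]
print(predictions, f1_score(LABELS, predictions))
```

Scoring on the training sentences, as done here, only checks that the pipeline runs end to end; a figure comparable to the reported F1-score would require held-out data.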
