Identifying Risk Factors Associated With Lower Back Pain in Electronic Medical Record Free Text: Deep Learning Approach Using Clinical Note Annotations.

Authors

Jaiswal Aman, Katz Alan, Nesca Marcello, Milios Evangelos

Affiliations

Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada.

Manitoba Centre for Health Policy, Department of Community Health Sciences, University of Manitoba, Winnipeg, MB, Canada.

Publication information

JMIR Med Inform. 2023 Aug 9;11:e45105. doi: 10.2196/45105.

DOI:10.2196/45105
PMID:37584559
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10461403/
Abstract

BACKGROUND

Lower back pain is a common debilitating condition that affects a large population. It is a leading cause of disability and lost productivity, and the associated medical costs and lost wages place a substantial burden on individuals and society. Recent advances in artificial intelligence and natural language processing have opened new opportunities for the identification and management of risk factors for lower back pain. In this paper, we propose and train a deep learning model on a data set of clinical notes that have been annotated with relevant risk factors, and we evaluate the model's performance in identifying risk factors in new clinical notes.

OBJECTIVE

The primary objective is to develop a novel deep learning approach to detect risk factors for underlying disease in patients presenting with lower back pain in clinical encounter notes. The secondary objective is to propose solutions to potential challenges of using deep learning and natural language processing techniques for identifying risk factors in electronic medical record free text and make practical recommendations for future research in this area.

METHODS

We manually annotated clinical notes for the presence of six risk factors for severe underlying disease in patients presenting with lower back pain. Data were highly imbalanced, with only 12% (n=296) of the annotated notes having at least one risk factor. To address imbalanced data, a combination of semantic textual similarity and regular expressions was used to further capture notes for annotation. Further analysis was conducted to study the impact of downsampling, binary formulation of multi-label classification, and unsupervised pretraining on classification performance.
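
The downsampling step mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the note structure (a dict with a `labels` list of six 0/1 flags), the function name `downsample_majority`, and the toy data are all assumptions made for illustration. It randomly drops notes from the larger group until notes with and without any risk factor are equally frequent, and would normally be applied to the training split only.

```python
import random


def downsample_majority(notes, seed=0):
    """Balance notes with vs. without any risk factor by random downsampling.

    `notes` is a list of dicts with a hypothetical "labels" field holding
    one 0/1 flag per risk factor. The larger group is randomly reduced to
    the size of the smaller group.
    """
    rng = random.Random(seed)
    positives = [n for n in notes if any(n["labels"])]
    negatives = [n for n in notes if not any(n["labels"])]
    # Identify which group is smaller; the other one gets downsampled.
    minority, majority = sorted((positives, negatives), key=len)
    kept = rng.sample(majority, len(minority))
    balanced = minority + kept
    rng.shuffle(balanced)
    return balanced


# Toy training set (invented): 80 notes, 10 of which carry a risk factor.
notes = [
    {"text": f"note {i}", "labels": [int(i % 8 == 0)] + [0] * 5}
    for i in range(80)
]
balanced = downsample_majority(notes)
n_pos = sum(any(n["labels"]) for n in balanced)
print(len(balanced), n_pos)
```

Fixing the seed keeps the retained subset reproducible across runs, which matters when comparing models trained on the same downsampled data.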

RESULTS

Of 2749 labeled clinical notes, 347 exhibited at least one risk factor, while 2402 exhibited none. The initial analysis showed that downsampling the training set to equalize the ratio of clinical notes with and without risk factors improved the macro-area under the receiver operating characteristic curve (AUROC) by 2%. The Bidirectional Encoder Representations from Transformers (BERT) model improved the macro-AUROC by 15% over the traditional machine learning baseline. In experiment 2, the proposed BERT-convolutional neural network (CNN) model for longer texts improved the macro-AUROC by 4% over the BERT baseline, and the multitask models were more stable for minority classes. In experiment 3, domain adaptation of BERT-CNN using masked language modeling improved the macro-AUROC by 2%.
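
For readers unfamiliar with the metric, macro-AUROC in a multi-label setting is commonly computed as the unweighted mean of the per-label AUROC values. The sketch below assumes that definition. The function names and the toy labels and scores are invented for illustration and do not come from the study.

```python
def auroc(y_true, y_score):
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC is undefined unless both classes are present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auroc(Y_true, Y_score):
    """Unweighted mean of per-label AUROC for multi-label predictions.

    Y_true and Y_score are lists of rows, one row per note and one
    column per label (risk factor).
    """
    n_labels = len(Y_true[0])
    per_label = [
        auroc([row[j] for row in Y_true], [row[j] for row in Y_score])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels, per_label


# Toy example (invented): 4 notes, 2 labels.
Y_true = [[1, 0], [0, 0], [1, 1], [0, 1]]
Y_score = [[0.9, 0.2], [0.3, 0.1], [0.4, 0.8], [0.5, 0.7]]
macro, per_label = macro_auroc(Y_true, Y_score)
print(per_label, macro)  # [0.75, 1.0] 0.875
```

Because every label contributes equally to the mean, rare risk factors weigh as much as common ones, which is why the macro average is a natural choice for highly imbalanced label sets.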

CONCLUSIONS

Primary care clinical notes are likely to require manipulation to perform meaningful free-text analysis. The application of BERT models for multi-label classification on downsampled annotated clinical notes is useful in detecting risk factors suggesting an indication for imaging for patients with lower back pain.
