Extraction of Geriatric Syndromes From Electronic Health Record Clinical Notes: Assessment of Statistical Natural Language Processing Methods.

Authors

Chen Tao, Dredze Mark, Weiner Jonathan P, Hernandez Leilani, Kimura Joe, Kharrazi Hadi

Affiliations

Center for Language and Speech Processing, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, United States.

Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, United States.

Publication

JMIR Med Inform. 2019 Mar 26;7(1):e13039. doi: 10.2196/13039.

DOI:10.2196/13039
PMID:30862607
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6454337/
Abstract

BACKGROUND

Geriatric syndromes in older adults are associated with adverse outcomes. However, despite being reported in clinical notes, these syndromes are often poorly captured by diagnostic codes in the structured fields of electronic health records (EHRs) or administrative records.

OBJECTIVE

We aim to automatically determine if a patient has any geriatric syndromes by mining the free text of associated EHR clinical notes. We assessed which statistical natural language processing (NLP) techniques are most effective.

METHODS

We applied conditional random fields (CRFs), a widely used machine learning algorithm, to identify each of 10 geriatric syndrome constructs in a clinical note. We assessed three sets of features and attributes for CRF operations: a base set, enhanced token, and contextual features. We trained the CRF on 3901 manually annotated notes from 85 patients, tuned the CRF on a validation set of 50 patients, and evaluated it on 50 held-out test patients. These notes were from a group of US Medicare patients over 65 years of age enrolled in a Medicare Advantage Health Maintenance Organization and cared for by a large group practice in Massachusetts.
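As a rough illustration of what per-token CRF features might look like, the sketch below builds one feature dictionary per token, grouped loosely under the three set names the abstract mentions (base, enhanced token, contextual). The specific features, the `B-VISION`/`I-VISION` tag names, the example sentence, and the function names are all assumptions made for illustration; the abstract does not specify the features or the tagging scheme the authors used.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int, window: int = 1) -> Dict[str, object]:
    """Build a feature dict for tokens[i]. All feature names are illustrative."""
    word = tokens[i]
    feats: Dict[str, object] = {
        # "Base"-style feature (assumed): the lowercased token itself.
        "lower": word.lower(),
        # "Enhanced token"-style features (assumed): shape and suffix.
        "is_title": word.istitle(),
        "is_digit": word.isdigit(),
        "suffix3": word[-3:].lower(),
    }
    # "Contextual"-style features (assumed): neighbouring tokens in a window.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        if 0 <= j < len(tokens):
            feats[f"lower[{offset:+d}]"] = tokens[j].lower()
        else:
            # Mark positions that fall outside the sentence boundary.
            feats[f"pad[{offset:+d}]"] = True
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """One feature dict per token, the usual input shape for a linear-chain CRF."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Hypothetical sentence with BIO tags for a hypothetical "VISION" construct.
tokens = ["Patient", "reports", "blurry", "vision", "today"]
labels = ["O", "O", "B-VISION", "I-VISION", "O"]
X = sentence_features(tokens)
```

A sequence labeler would then be trained on pairs of feature lists and tag lists such as `(X, labels)`. Splitting by patient rather than by note, as the abstract describes, keeps notes from the same patient out of both training and test data.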

RESULTS

A final feature set was formed through comprehensive feature ablation experiments. The final CRF model performed well at patient-level determination (macroaverage F1=0.834, microaverage F1=0.851); however, performance varied by construct. For example, at phrase-partial evaluation, the CRF model worked well on constructs such as absence of fecal control (F1=0.857) and vision impairment (F1=0.798) but poorly on malnutrition (F1=0.155), weight loss (F1=0.394), and severe urinary control issues (F1=0.532). Errors were primarily due to previously unobserved words (ie, out-of-vocabulary) and a lack of context.
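To make the distinction between the macroaverage and microaverage F1 concrete, here is a minimal sketch that computes both from per-construct counts of true positives, false positives, and false negatives. The construct names and counts are invented for illustration and are not taken from the study.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts: Dict[str, Counts]) -> float:
    """Average the per-construct F1 scores, weighting every construct equally."""
    return sum(f1(*c) for c in counts.values()) / len(counts)


def micro_f1(counts: Dict[str, Counts]) -> float:
    """Pool counts across constructs first, so frequent constructs dominate."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn)


# Hypothetical counts: one frequent, well-detected construct and one rare, poorly detected one.
counts = {
    "construct_a": (90, 10, 10),
    "construct_b": (2, 8, 10),
}
macro = macro_f1(counts)
micro = micro_f1(counts)
```

With counts like these the microaverage exceeds the macroaverage, because the rare, poorly detected construct contributes few pooled counts but half of the macro mean. The same mechanism could account for a microaverage above the macroaverage when performance varies by construct.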

CONCLUSIONS

This study shows that statistical NLP can be used to identify geriatric syndromes from EHR-extracted clinical notes. This creates new opportunities to identify patients with geriatric syndromes and study their health outcomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7642/6454337/f939dacb5180/medinform_v7i1e13039_fig1.jpg
