Development and Evaluation of a Natural Language Processing Annotation Tool to Facilitate Phenotyping of Cognitive Status in Electronic Health Records: Diagnostic Study.

Affiliations

Department of Neurology, Massachusetts General Hospital, Boston, MA, United States.

Harvard Medical School, Boston, MA, United States.

Publication information

J Med Internet Res. 2022 Aug 30;24(8):e40384. doi: 10.2196/40384.

DOI: 10.2196/40384
PMID: 36040790
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9472045/
Abstract

BACKGROUND

Electronic health records (EHRs) with large sample sizes and rich information offer great potential for dementia research, but current methods of phenotyping cognitive status are not scalable.

OBJECTIVE

The aim of this study was to evaluate whether natural language processing (NLP)-powered semiautomated annotation can improve the speed and interrater reliability of chart reviews for phenotyping cognitive status.

METHODS

In this diagnostic study, we developed and evaluated a semiautomated NLP-powered annotation tool (NAT) to facilitate phenotyping of cognitive status. Clinical experts adjudicated the cognitive status of 627 patients at Mass General Brigham (MGB) health care, using NAT or traditional chart reviews. Patient charts contained EHR data from two data sets: (1) records from January 1, 2017, to December 31, 2018, for 100 Medicare beneficiaries from the MGB Accountable Care Organization and (2) records from 2 years prior to COVID-19 diagnosis to the date of COVID-19 diagnosis for 527 MGB patients. All EHR data from the relevant period were extracted; diagnosis codes, medications, and laboratory test values were processed and summarized; clinical notes were processed through an NLP pipeline; and a web tool was developed to present an integrated view of all data. Cognitive status was rated as cognitively normal, cognitively impaired, or undetermined. Assessment time and interrater agreement of NAT compared to manual chart reviews for cognitive status phenotyping were evaluated.

RESULTS

NAT adjudication provided higher interrater agreement (Cohen κ=0.89 vs κ=0.80) and a significant speedup (time difference mean 1.4, SD 1.3 minutes; P<.001; ratio median 2.2, min-max 0.4-20) over manual chart reviews. Agreement between NAT adjudication and manual chart reviews was moderate (Cohen κ=0.67). In the cases that exhibited disagreement with manual chart reviews, NAT adjudication was able to produce assessments that had broader clinical consensus due to its integrated view of highlighted relevant information and semiautomated NLP features.
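
The interrater agreement figures above are Cohen κ values, which correct the raw proportion of matching ratings for the agreement expected by chance. As a minimal sketch of how such a value is computed for two raters using the three categories from the Methods, the following may help; the function name `cohen_kappa`, the label strings, and the six example ratings are illustrative assumptions and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over all labels.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings for six charts (made up for illustration only).
rater_1 = ["normal", "normal", "impaired", "impaired", "undetermined", "normal"]
rater_2 = ["normal", "normal", "impaired", "normal", "undetermined", "normal"]

kappa = cohen_kappa(rater_1, rater_2)
print(f"Cohen kappa = {kappa:.3f}")  # prints "Cohen kappa = 0.714"
```

In this toy example the raters agree on 5 of 6 charts (observed agreement 0.833), while chance agreement from their label frequencies is 15/36 (0.417), giving κ = 5/7 ≈ 0.714.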

CONCLUSIONS

NAT adjudication improves the speed and interrater reliability for phenotyping cognitive status compared to manual chart reviews. This study underscores the potential of an NLP-based clinically adjudicated method to build large-scale dementia research cohorts from EHRs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/194a/9472045/fe25a61a0d36/jmir_v24i8e40384_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/194a/9472045/c867cb2d5ed1/jmir_v24i8e40384_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/194a/9472045/b13c75088a28/jmir_v24i8e40384_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/194a/9472045/705802945e06/jmir_v24i8e40384_fig4.jpg
