EXTraction of EMR numerical data: an efficient and generalizable tool to EXTEND clinical research.

Affiliations

Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Boston, MA, 6016BB, 60 Fenwood Road, Boston, 02115, USA.

Harvard Medical School, Boston, MA, USA.

Publication information

BMC Med Inform Decis Mak. 2019 Nov 15;19(1):226. doi: 10.1186/s12911-019-0970-1.

DOI: 10.1186/s12911-019-0970-1
PMID: 31730484
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6858776/
Abstract

BACKGROUND

Electronic medical records (EMR) contain numerical data important for clinical outcomes research, such as vital signs and cardiac ejection fractions (EF), which tend to be embedded in narrative clinical notes. In current practice, this data is often manually extracted for use in research studies. However, due to the large volume of notes in datasets, manually extracting numerical data often becomes infeasible. The objective of this study is to develop and validate a natural language processing (NLP) tool that can efficiently extract numerical clinical data from narrative notes.

RESULTS

To validate the accuracy of the tool, EXTraction of EMR Numerical Data (EXTEND), we developed a reference standard by manually extracting vital signs from 285 notes, EF values from 300 notes, and glycated hemoglobin (HbA1C) and serum creatinine from 890 notes. For each parameter of interest, we calculated the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F score of EXTEND using two metrics: (1) completion of data extraction, and (2) accuracy of data extraction compared to the actual values in the note verified by chart review. At the note level, extraction by EXTEND was considered correct only if it accurately detected and extracted all values of interest in a note. Using manually annotated labels as the gold standard, the note-level accuracy of EXTEND in capturing the numerical vital sign values, EF, HbA1C, and creatinine ranged from 0.88 to 0.95 for sensitivity, 0.95 to 1.0 for specificity, 0.95 to 1.0 for PPV, 0.89 to 0.99 for NPV, and 0.92 to 0.96 for F score. At the value level, compared to the actual values, the sensitivity, PPV, and F score of EXTEND ranged from 0.91 to 0.95, 0.95 to 1.0, and 0.95 to 0.96, respectively.
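The validation measures named above can be computed from a confusion matrix of tool output against the manually built reference standard. The following is a minimal sketch, not the authors' code: the class and function names are hypothetical, the counts are made-up illustrative numbers rather than study data, and the abstract does not state which F score was used, so the balanced F1 is assumed here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ConfusionCounts:
    """Counts of tool output versus the manual reference standard (hypothetical)."""
    tp: int  # value present in the note and correctly extracted
    fp: int  # value extracted although absent or wrong
    tn: int  # no value present and none extracted
    fn: int  # value present but missed


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def validation_metrics(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Compute sensitivity, specificity, PPV, NPV, and F1 (assumed F score)."""
    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
        # F1 is the harmonic mean of PPV and sensitivity: 2TP / (2TP + FP + FN).
        "f_score": ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


# Illustrative counts only; these are not figures from the study.
example = validation_metrics(ConfusionCounts(tp=90, fp=2, tn=95, fn=10))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

At the note level, a note would count as a true positive only when every value of interest in it was extracted correctly, which is a stricter unit of analysis than counting individual values.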

CONCLUSIONS

EXTEND is an efficient, flexible tool that uses knowledge-based rules to extract clinical numerical parameters with high accuracy. By increasing dictionary terms and developing new rules, the usage of EXTEND can easily be expanded to extract additional numerical data important in clinical outcomes research.
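To make the idea of knowledge-based rules driven by a term dictionary concrete, here is a toy sketch of dictionary-plus-pattern extraction. It is an assumption-laden illustration and not the EXTEND implementation: the dictionary entries, the 20-character search window, and the function names are all hypothetical, and it ignores ranges, units, negation, and dates.

```python
import re
from typing import Dict, List

# Hypothetical dictionary: parameter name -> surface terms that may precede a value.
TERM_DICTIONARY: Dict[str, List[str]] = {
    "ef": ["ejection fraction", "lvef", "ef"],
    "hba1c": ["hemoglobin a1c", "hba1c", "a1c"],
    "creatinine": ["serum creatinine", "creatinine"],
}


def build_pattern(terms: List[str]) -> "re.Pattern[str]":
    """Build one rule: a dictionary term, a short non-digit gap, then a number."""
    # Longest terms first so "ejection fraction" is preferred over "ef".
    alternatives = "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))
    return re.compile(
        rf"\b(?:{alternatives})\b\D{{0,20}}?(\d+(?:\.\d+)?)",
        re.IGNORECASE,
    )


PATTERNS = {name: build_pattern(terms) for name, terms in TERM_DICTIONARY.items()}


def extract_values(note: str) -> Dict[str, List[float]]:
    """Return every number that follows a known term, grouped by parameter."""
    return {
        name: [float(m.group(1)) for m in pattern.finditer(note)]
        for name, pattern in PATTERNS.items()
    }


note = "LVEF 55%. HbA1c 7.2% on 3/1. Creatinine: 1.1 mg/dL."
print(extract_values(note))
```

In this sketch, covering a new parameter only requires adding an entry to `TERM_DICTIONARY`, which mirrors the extensibility claim in the conclusions; a real system would need additional rules for context such as ranges, units, and negated or historical mentions.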
