Adapting existing natural language processing resources for cardiovascular risk factors identification in clinical notes.

Authors

Khalifa Abdulrahman, Meystre Stéphane

Affiliation

Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States.

Publication information

J Biomed Inform. 2015 Dec;58 Suppl(Suppl):S128-S132. doi: 10.1016/j.jbi.2015.08.002. Epub 2015 Aug 28.

DOI:10.1016/j.jbi.2015.08.002

PMID:26318122

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4983192/

Abstract

The 2014 i2b2 natural language processing shared task focused on identifying cardiovascular risk factors, such as high blood pressure, high cholesterol levels, obesity, and smoking status, in the health records of diabetic patients. The task also involved detecting medications and the time information associated with the extracted data. This paper presents the development and evaluation of a natural language processing (NLP) application built for this i2b2 shared task. For efficiency, the application's main components were adapted from two existing NLP tools implemented in the Apache UIMA framework: Textractor (for dictionary-based lookup) and cTAKES (for preprocessing and smoking status detection). The application achieved a micro-averaged F1-measure of 87.5% on the final evaluation test set. Our approach relied mostly on existing tools adapted with minimal changes and achieved satisfactory performance with limited development effort.
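
The micro-averaged F1-measure reported in the abstract pools true positives, false positives, and false negatives over all documents and categories before computing precision and recall. The sketch below illustrates that calculation in Python. It is not the official i2b2 evaluation script: the function name `micro_f1`, the annotation tuples, and the sample labels are hypothetical and chosen only for illustration.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over pooled annotation counts.

    gold, predicted: dict mapping a document id to a set of
    (category, value) tuples. This layout is an assumption made
    for illustration, not the shared task's annotation format.
    """
    tp = fp = fn = 0
    for doc_id in set(gold) | set(predicted):
        g = gold.get(doc_id, set())
        p = predicted.get(doc_id, set())
        tp += len(g & p)   # annotations found in both sets
        fp += len(p - g)   # predicted but not in the gold standard
        fn += len(g - p)   # in the gold standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data (labels invented for this sketch).
gold = {
    "doc1": {("HYPERTENSION", "before"), ("SMOKER", "current")},
    "doc2": {("OBESE", "during")},
}
predicted = {
    "doc1": {("HYPERTENSION", "before")},
    "doc2": {("OBESE", "during"), ("MEDICATION", "statin")},
}
score = micro_f1(gold, predicted)
```

Because counts are pooled before averaging, frequent categories weigh more heavily in the final score than rare ones, which is the usual distinction between micro- and macro-averaging.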
