Mining fall-related information in clinical notes: Comparison of rule-based and novel word embedding-based machine learning approaches.

Author information

Topaz Maxim, Murga Ludmila, Gaddis Katherine M, McDonald Margaret V, Bar-Bachar Ofrit, Goldberg Yoav, Bowles Kathryn H

Affiliations

School of Nursing & Data Science Institute, Columbia University, New York, NY, USA; The Visiting Nurse Service of New York, New York, NY, USA.

Cheryl Spencer Department of Nursing, University of Haifa, Haifa, Israel.

Publication information

J Biomed Inform. 2019 Feb;90:103103. doi: 10.1016/j.jbi.2019.103103. Epub 2019 Jan 9.

DOI:10.1016/j.jbi.2019.103103

PMID:30639392

Abstract

BACKGROUND

Natural language processing (NLP) of health-related data is still an expertise-demanding and resource-intensive process. We created a novel, open-source, rapid clinical text mining system called NimbleMiner. NimbleMiner combines several machine learning techniques (word embedding models and positive-only labels learning) so that a human can rapidly perform text mining of clinical narratives while being aided by the machine learning components.
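To make the role of word embeddings concrete, the following is a minimal sketch of how a seed term might be expanded into candidate similar terms for a human to review. It is not NimbleMiner's implementation: the function names (`cosine`, `suggest_similar`) and the three-dimensional toy vectors are hypothetical and invented purely for illustration, whereas a real system would use embeddings trained on the clinical corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def suggest_similar(seed, embeddings, top_n=3):
    """Rank every other term by its similarity to the seed term."""
    seed_vec = embeddings[seed]
    scored = [
        (term, cosine(seed_vec, vec))
        for term, vec in embeddings.items()
        if term != seed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy vectors, NOT taken from the paper or from any trained model.
TOY_EMBEDDINGS = {
    "fell": [0.90, 0.10, 0.00],
    "fall": [0.85, 0.15, 0.05],
    "slipped": [0.80, 0.20, 0.10],
    "medication": [0.05, 0.90, 0.30],
    "diet": [0.00, 0.30, 0.90],
}

if __name__ == "__main__":
    # A reviewer would accept or reject each suggested term.
    for term, score in suggest_similar("fell", TOY_EMBEDDINGS, top_n=2):
        print(f"{term}\t{score:.3f}")
```

In a workflow of this kind, the accepted terms would then be used to assign positive labels to notes that contain them, which is one plausible reading of the "positive-only labels" step named above.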

OBJECTIVE

This manuscript describes the general system architecture and user interface and presents the results of a case study aimed at classifying fall-related information (including fall history, fall prevention interventions, and fall risk) in homecare visit notes.

METHODS

We extracted a corpus of homecare visit notes (n = 1,149,586) for 89,459 patients from a large US-based homecare agency. We used a gold standard testing dataset of 750 notes, annotated by two human reviewers, to compare NimbleMiner's ability to classify documents as containing fall-related information against that of a previously developed rule-based NLP system.

RESULTS

NimbleMiner outperformed the rule-based system in almost all domains. The overall F-score was 85.8% compared to 81% for the rule-based system, with the best performance for identifying general fall history (F = 89% vs. F = 85.1% rule-based), followed by fall risk (F = 87% vs. F = 78.7% rule-based), fall prevention interventions (F = 88.1% vs. F = 78.2% rule-based) and fall within 2 days of the note date (F = 83.1% vs. F = 80.6% rule-based). The rule-based system achieved slightly better performance for fall within 2 weeks of the note date (F = 81.9% vs. F = 84% rule-based).
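For readers unfamiliar with the metric, the sketch below shows how an F-score can be computed for document-level classification against gold standard labels. It assumes the reported F is the balanced F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_score` and the example labels are hypothetical and are not data from the study.

```python
def f_score(gold, predicted):
    """Return (precision, recall, F1) for paired binary document labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # true positives
    fp = sum(1 for g, p in pairs if not g and p)    # false positives
    fn = sum(1 for g, p in pairs if g and not p)    # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels for eight notes: 1 = contains fall-related information.
GOLD = [1, 1, 1, 0, 0, 1, 0, 1]
PREDICTED = [1, 1, 0, 0, 1, 1, 0, 1]

if __name__ == "__main__":
    precision, recall, f1 = f_score(GOLD, PREDICTED)
    print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Under this assumption, each per-domain figure would come from running the same calculation on the labels for that domain alone.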

DISCUSSION & CONCLUSIONS

NimbleMiner outperformed other systems aimed at fall information classification, including our previously developed rule-based approach. These promising results indicate that clinical text mining can be implemented without the large labeled datasets that other types of machine learning require. This is critical for domains with little NLP development, such as nursing and the allied health professions.
