Heart disease risk factors detection from electronic health records using advanced NLP and deep learning techniques.

Affiliation

Faculty of Computers and Information, Minia University, Minia, Egypt.

Publication information

Sci Rep. 2023 May 3;13(1):7173. doi: 10.1038/s41598-023-34294-6.

Abstract

Heart disease remains the leading cause of death, despite recent improvements in prediction and prevention. Identifying risk factors is the main step in diagnosing and preventing heart disease. Automatically detecting risk factors for heart disease in clinical notes can help with disease progression modeling and clinical decision-making. Many studies have attempted to detect risk factors for heart disease, but none has identified all of them. These studies proposed hybrid systems that combine knowledge-driven and data-driven techniques, based on dictionaries, rules, and machine learning methods that require significant human effort. In 2014, the National Center for Informatics for Integrating Biology and the Bedside (i2b2) organized a clinical natural language processing (NLP) challenge with a track (track 2) focused on detecting heart disease risk factors in clinical notes over time. Clinical narratives contain a wealth of information that can be extracted using NLP and deep learning techniques. The objective of this paper is to improve on previous work on the 2014 i2b2 challenge by identifying tags and attributes relevant to disease diagnosis, risk factors, and medications, using stacked word embeddings. Stacking embeddings, that is, combining several embeddings into one representation, yielded a significant improvement on the i2b2 heart disease risk factors challenge dataset. Our model achieved an F1 score of 93.66% by stacking BERT and character embeddings (CHARACTER-BERT Embedding). The proposed model performs significantly better than all other models and systems that we developed for the 2014 i2b2 challenge.
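The abstract describes stacking as combining several embeddings. A common way to do this is to concatenate the vectors that each embedding produces for the same token. The following is a minimal sketch of that idea only, not the authors' implementation: `toy_word_embedding` and `toy_char_embedding` are hypothetical stand-ins for a BERT embedding and a character-level embedding, and all vectors and dimensions are invented for illustration.

```python
from typing import Callable, Dict, List

Vector = List[float]
Embedder = Callable[[str], Vector]


def toy_word_embedding(token: str) -> Vector:
    """Hypothetical stand-in for a word-level embedding such as BERT.

    A real model would return a high-dimensional, context-dependent vector;
    here a tiny fixed lookup table is used so the example runs anywhere.
    """
    table: Dict[str, Vector] = {
        "hypertension": [0.9, 0.1, 0.0],
        "aspirin": [0.1, 0.8, 0.2],
    }
    # Unknown tokens map to a zero vector of the same size.
    return table.get(token.lower(), [0.0, 0.0, 0.0])


def toy_char_embedding(token: str) -> Vector:
    """Hypothetical stand-in for a character-level embedding.

    Encodes two crude surface features: scaled token length and
    whether the token contains a digit (e.g. a dosage such as "81mg").
    """
    scaled_length = min(len(token), 20) / 20.0
    has_digit = 1.0 if any(ch.isdigit() for ch in token) else 0.0
    return [scaled_length, has_digit]


def stack_embeddings(token: str, embedders: List[Embedder]) -> Vector:
    """Concatenate the vectors from every embedder into one token vector."""
    stacked: Vector = []
    for embed in embedders:
        stacked.extend(embed(token))
    return stacked


def embed_sentence(tokens: List[str], embedders: List[Embedder]) -> List[Vector]:
    """Build one stacked vector per token of a tokenized clinical note."""
    return [stack_embeddings(token, embedders) for token in tokens]


if __name__ == "__main__":
    embedders = [toy_word_embedding, toy_char_embedding]
    for token, vector in zip(
        ["Aspirin", "81mg", "daily"],
        embed_sentence(["Aspirin", "81mg", "daily"], embedders),
    ):
        print(token, vector)
```

Each stacked vector has a length equal to the sum of the component dimensions (3 + 2 = 5 in this toy case). A downstream sequence tagger would consume these vectors to predict risk-factor tags per token.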
