Deep Learning Transformer Models for Building a Comprehensive and Real-time Trauma Observatory: Development and Validation Study.

Author information

Chenais Gabrielle, Gil-Jardiné Cédric, Touchais Hélène, Avalos Fernandez Marta, Contrand Benjamin, Tellier Eric, Combes Xavier, Bourdois Loick, Revel Philippe, Lagarde Emmanuel

Affiliations

Unit 1219, Bordeaux Public Health Center, Institut National de la Santé et de la Recherche Médicale, Bordeaux, France.

Emergency Department, Bordeaux University Hospital, Bordeaux, France.

Publication information

JMIR AI. 2023 Jan 12;2:e40843. doi: 10.2196/40843.

DOI: 10.2196/40843

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11041521/

Abstract

BACKGROUND

Public health surveillance relies on the collection of data, often in near-real time. Recent advances in natural language processing make it possible to envisage an automated system for extracting information from electronic health records.

OBJECTIVE

To study the feasibility of setting up a national trauma observatory in France, we compared the performance of several automatic language processing methods in a multiclass classification task of unstructured clinical notes.

METHODS

A total of 69,110 free-text clinical notes related to visits to the emergency departments of the University Hospital of Bordeaux, France, between 2012 and 2019 were manually annotated. Among these clinical notes, 32.5% (22,481/69,110) concerned trauma. We trained 4 transformer models (deep learning models that incorporate an attention mechanism) and compared them with a baseline that pairs term frequency-inverse document frequency (TF-IDF) features with a support vector machine (SVM) classifier.
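As a rough illustration of the term frequency-inverse document frequency weighting used by the baseline, the following minimal sketch computes weights in plain Python. It assumes whitespace tokenization and the classic tf × log(N/df) formula; the abstract does not state which variant or classifier settings the authors used, and the support vector machine step is omitted here. The function name `tfidf` and the three English toy notes are invented for illustration and are not taken from the study data.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


# Hypothetical toy notes (not from the study).
notes = [
    "fall from ladder wrist pain",
    "chest pain no trauma",
    "fall on stairs ankle pain",
]
vectors = tfidf(notes)
```

In this toy corpus, "pain" occurs in every note and therefore receives a weight of zero, while a term found in a single note, such as "ladder", is weighted more heavily than "fall", which occurs in two. A classifier such as a support vector machine would then be trained on these sparse vectors.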

RESULTS

The transformer models consistently outperformed the term frequency-inverse document frequency and support vector machine baseline. Among the transformers, the GPTanam model, pretrained on a French corpus with an additional self-supervised learning step on 306,368 unlabeled clinical notes, showed the best performance, with a micro F-score of 0.969.
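For readers unfamiliar with the metric, a micro F-score pools true positives, false positives, and false negatives over all classes before computing a single F1 value. The sketch below shows that calculation in plain Python. The function name `micro_f1`, the class labels, and the toy predictions are hypothetical and chosen only for illustration; they are not the study's categories or results.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, FN over all classes, then compute F1 once."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    if total_tp == 0:
        return 0.0

    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five notes (not study data).
y_true = ["fall", "traffic", "non_trauma", "fall", "non_trauma"]
y_pred = ["fall", "fall", "non_trauma", "fall", "traffic"]
score = micro_f1(y_true, y_pred)
```

Here 3 of the 5 toy predictions are correct, so the micro F-score is 0.6. When every note receives exactly one label, each error counts once as a false positive and once as a false negative, so the micro F-score equals plain accuracy.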

CONCLUSIONS

The transformers proved efficient at the multiclass classification of narrative and medical data. Further steps for improvement should focus on the expansion of abbreviations and multioutput multiclass classification.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cdfb/11041521/0198522fc173/ai_v2i1e40843_fig1.jpg
