Automated classification of clinical diagnoses in electronic health records using transformer.

Authors

Dai Lixia, Xu Hang, Zhang Yugui

Affiliations

Antai College of Economics and Management, School of Shanghai Jiao Tong University, Shanghai, China.

Vanke School of Public Health, Tsinghua University, Beijing, China.

Publication information

PLoS One. 2025 Sep 11;20(9):e0329963. doi: 10.1371/journal.pone.0329963. eCollection 2025.

Abstract

The automated classification of clinical diagnoses in electronic health records (EHRs) is critical for enhancing clinical decision-making and enabling large-scale medical research, yet existing methods struggle with heterogeneous data structures and limited annotated datasets. Current approaches fail to adequately address the dual challenges of extracting contextual medical semantics from unstructured clinical narratives while maintaining generalizability across institutions with divergent documentation practices. This study proposes a novel framework integrating three core components: a Transformer-based architecture for hierarchical feature extraction from clinical text, a multi-task learning paradigm leveraging diagnostic interdependencies, and transfer learning initialization using pretrained medical language models. Evaluation on the MIMIC-III dataset demonstrates state-of-the-art performance with 89.2% accuracy and 87.6% F1-score, outperforming conventional CNN-RNN hybrids by 8.0% in recall and showing 4.9-6.2% improvements over ablated configurations in critical metrics. The results establish that synergistic integration of contextual attention mechanisms, cross-task knowledge sharing, and medical domain adaptation effectively addresses EHR heterogeneity while reducing reliance on institution-specific annotations, providing a robust foundation for clinical decision support systems that balance accuracy with real-world implementability across diverse healthcare environments.
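The abstract reports accuracy, recall, and F1-score without stating how they were aggregated. As a reminder of what these figures measure, here is a minimal sketch of the standard definitions for a single binary label. The function name `binary_metrics` and the toy labels are illustrative assumptions, not taken from the paper, and the paper may well use a multi-label or averaged variant.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for 0/1 labels.

    Hypothetical helper for illustration only; not the paper's evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels, invented for this example.
    truth = [1, 1, 1, 0, 0, 1, 0, 1]
    preds = [1, 0, 1, 0, 1, 1, 0, 1]
    print(binary_metrics(truth, preds))
```

Because F1 combines precision and recall, a model can post a high accuracy on imbalanced diagnosis codes while its F1 stays low, which is presumably why the abstract reports both.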
