Towards automated clinical coding.

Affiliation

University College London, Gower Street, London WC1E 6BT, UK.

Publication information

Int J Med Inform. 2018 Dec;120:50-61. doi: 10.1016/j.ijmedinf.2018.09.021. Epub 2018 Oct 2.

DOI: 10.1016/j.ijmedinf.2018.09.021
PMID: 30409346
Abstract

BACKGROUND

Patients' encounters with healthcare services must undergo clinical coding. These codes are typically derived from free-text notes. Manual clinical coding is expensive, time-consuming and prone to error. Automated clinical coding systems have great potential to save resources, and real-time availability of codes would improve oversight of patient care and accelerate research. Automated coding is made challenging by the idiosyncrasies of clinical text, the large number of disease codes and their unbalanced distribution.

METHODS

We explore methods for representing clinical text and the labels in hierarchical clinical coding ontologies. Text is represented first as term frequency-inverse document frequency (TF-IDF) counts and then as word embeddings, which we use as input to recurrent neural networks. Labels are represented first atomically, and then by learning a representation of each node in a coding ontology and composing a representation for each label from its node path. We consider different strategies for initialising the node representations. We evaluate our methods on the publicly available Medical Information Mart for Intensive Care III (MIMIC-III) dataset: we extract the history of presenting illness section from each discharge summary in the dataset, then predict the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes associated with each summary.
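
To make the idea of composing a label from its node path concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a simplified hierarchy in which a code's ancestors are its three-character category and each successively longer prefix, random initialisation of node vectors, and composition by element-wise sum. The names `node_path`, `NodeEmbeddings` and `compose_label` are hypothetical, and the abstract does not state which composition function or initialisation was used.

```python
import random
from typing import Dict, List


def node_path(code: str) -> List[str]:
    """Return an assumed root-to-leaf node path for an ICD-9-CM-style code.

    Simplification: the path is the category plus each longer prefix,
    e.g. '428.21' -> ['428', '428.2', '428.21'].
    """
    category, _, detail = code.partition(".")
    path = [category]
    for i in range(1, len(detail) + 1):
        path.append(f"{category}.{detail[:i]}")
    return path


class NodeEmbeddings:
    """Lazily created node vectors (random initialisation is an assumption)."""

    def __init__(self, dim: int, seed: int = 0) -> None:
        self.dim = dim
        self._rng = random.Random(seed)
        self._table: Dict[str, List[float]] = {}

    def vector(self, node: str) -> List[float]:
        if node not in self._table:
            self._table[node] = [self._rng.gauss(0.0, 0.1) for _ in range(self.dim)]
        return self._table[node]


def compose_label(code: str, nodes: NodeEmbeddings) -> List[float]:
    """Compose a label vector as the element-wise sum of its path's node vectors."""
    vectors = [nodes.vector(n) for n in node_path(code)]
    return [sum(components) for components in zip(*vectors)]


nodes = NodeEmbeddings(dim=4)
label_a = compose_label("428.21", nodes)
label_b = compose_label("428.22", nodes)  # shares the '428' and '428.2' nodes
```

Under these assumptions, sibling codes such as `428.21` and `428.22` share every node vector except the leaf, so a rarely seen code can borrow what was learned for its more frequent relatives.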

RESULTS

Composing the label representations from the clinical-coding-ontology nodes increased weighted F1 for prediction of the 17,561 disease labels to 0.264-0.281 from 0.232-0.249 for atomic representations. Recurrent neural network text representation improved weighted F1 for prediction of the 19 disease-category labels to 0.682-0.701 from 0.662-0.682 using term frequency-inverse document frequency. However, term frequency-inverse document frequency outperformed recurrent neural networks for prediction of the 17,561 disease labels.
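
For readers unfamiliar with the metric, here is a small Python sketch of one common definition of weighted F1 for multi-label prediction: the per-label F1 scores averaged with weights equal to each label's number of true occurrences (its support). That the study computed the metric in exactly this way is an assumption. The function name `weighted_f1` and the toy codes are illustrative only and are not results from the paper.

```python
from typing import List, Set


def weighted_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Support-weighted mean of per-label F1 over labels that occur in y_true."""
    labels: Set[str] = set().union(*y_true)
    total_support = 0
    weighted_sum = 0.0
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
        support = tp + fn  # number of documents that truly carry this label
        denom = 2 * tp + fp + fn
        f1 = (2 * tp / denom) if denom else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Toy example with three documents and three codes.
truth = [{"428.0", "401.9"}, {"401.9"}, {"250.00"}]
preds = [{"428.0"}, {"401.9", "250.00"}, {"250.00"}]
score = weighted_f1(truth, preds)  # (1*1 + 2*(2/3) + 1*(2/3)) / 4 = 0.75
```

Because the weights follow label frequency, common codes dominate the score. This is one reason gains on rarer diseases can be hard to see in a single aggregate number.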

CONCLUSIONS

This study demonstrates that hierarchically structured medical knowledge can be incorporated into statistical models, and that doing so improves performance in automated clinical coding. This performance improvement results primarily from improved representation of rarer diseases. We also show that recurrent neural networks improve representation of medical text in some settings. Learning good representations of the very rare diseases in clinical coding ontologies from data alone remains challenging, and alternative means of representing these diseases will form a major focus of future work on automated clinical coding.
