Diagnosis code assignment: models and evaluation metrics.

Affiliation

Department of Biomedical Informatics, Columbia University, New York, New York, USA.

Publication information

J Am Med Inform Assoc. 2014 Mar-Apr;21(2):231-7. doi: 10.1136/amiajnl-2013-002159. Epub 2013 Dec 2.

DOI:10.1136/amiajnl-2013-002159
PMID:24296907
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3932472/
Abstract

BACKGROUND AND OBJECTIVE

The volume of healthcare data is growing rapidly with the adoption of health information technology. We focus on automated ICD9 code assignment from discharge summary content and methods for evaluating such assignments.

METHODS

We study ICD9 diagnosis codes and discharge summaries from the publicly available Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC II) repository. We experiment with two coding approaches: one that treats each ICD9 code independently of the others (flat classifier), and one that incorporates the hierarchical structure of ICD9 codes into its model (hierarchy-based classifier). We propose novel evaluation metrics that reflect the distances between gold-standard and predicted codes and their locations in the ICD9 tree. The experimental setup, modeling code, and evaluation scripts are made available to the research community.
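
To make the idea of a distance between gold-standard and predicted codes in the ICD9 tree concrete, here is a minimal Python sketch. It is not the authors' evaluation script: the `PARENT` map is a small hand-written fragment of an ICD9-like hierarchy used purely for illustration, and the function names `path_to_root`, `tree_distance`, and `same_subtree` are hypothetical. Distance is assumed here to be the number of edges on the path between two codes; the paper's actual metrics may be defined differently.

```python
# Toy fragment of an ICD9-like tree as a child -> parent map.
# Illustrative only; this is an assumption, not the full ICD9 hierarchy.
PARENT = {
    "390-459": "ROOT",
    "420-429": "390-459",
    "427": "420-429",
    "427.3": "427",
    "427.31": "427.3",
    "428": "420-429",
    "428.0": "428",
    "240-279": "ROOT",
    "250": "240-279",
    "250.0": "250",
    "250.00": "250.0",
}


def path_to_root(code):
    """Return the list [code, parent, grandparent, ..., 'ROOT']."""
    path = [code]
    while path[-1] != "ROOT":
        path.append(PARENT[path[-1]])
    return path


def tree_distance(a, b):
    """Number of edges on the path between two codes in the tree."""
    path_a = path_to_root(a)
    steps_from_a = {node: i for i, node in enumerate(path_a)}
    for steps_from_b, node in enumerate(path_to_root(b)):
        if node in steps_from_a:
            # The first shared ancestor is the lowest common ancestor.
            return steps_from_a[node] + steps_from_b
    raise ValueError("codes do not share a root")


def same_subtree(a, b, level=1):
    """True if both codes sit under the same ancestor `level` steps below ROOT."""
    pa, pb = path_to_root(a), path_to_root(b)
    return len(pa) > level and len(pb) > level and pa[-1 - level] == pb[-1 - level]


gold_code, near_miss, far_miss = "428.0", "427.31", "250.00"
print(tree_distance(gold_code, near_miss))  # 5
print(tree_distance(gold_code, far_miss))   # 8
print(same_subtree(gold_code, near_miss))   # True
print(same_subtree(gold_code, far_miss))    # False
```

In this toy tree, a prediction in the same top-level chapter as the gold-standard code is closer than one in a different chapter. That is the kind of distinction exact-match precision and recall cannot express.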

RESULTS

The hierarchy-based classifier outperforms the flat classifier with F-measures of 39.5% and 27.6%, respectively, when trained on 20,533 documents and tested on 2,282 documents. While recall is improved at the expense of precision, our novel evaluation metrics show a more refined assessment: for instance, the hierarchy-based classifier identifies the correct sub-tree of gold-standard codes more often than the flat classifier. Error analysis reveals that gold-standard codes are not perfect, and as such the recall and precision are likely underestimated.
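
For readers unfamiliar with the F-measure, the sketch below shows one common way to compute precision, recall, and F-measure when each document has a set of codes. The abstract does not say how scores are averaged, so the micro-averaging used here is an assumption. The function name `micro_prf` and the toy documents are hypothetical, and the numbers do not reproduce the reported 39.5% and 27.6%.

```python
def micro_prf(gold, predicted):
    """Micro-averaged precision, recall and F-measure over per-document code sets.

    `gold` and `predicted` are equal-length lists of sets of code strings.
    """
    true_pos = sum(len(g & p) for g, p in zip(gold, predicted))
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Two toy documents (made-up data, for illustration only).
gold = [{"428.0", "427.31"}, {"250.00"}]
predicted = [{"428.0", "428"}, {"250.00", "427.31"}]

p, r, f = micro_prf(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Because only exact matches count, predicting "428" when the gold-standard code is "428.0" is scored as a complete miss. This is one reason exact-match scores can understate how useful a prediction is.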

CONCLUSIONS

Hierarchy-based classification yields better ICD9 coding than flat classification for MIMIC patients. Automated ICD9 coding is an example of a task for which data and tools can be shared and for which the research community can work together to build on shared models and advance the state of the art.

Similar articles

1. Diagnosis code assignment: models and evaluation metrics.
   J Am Med Inform Assoc. 2014 Mar-Apr;21(2):231-7. doi: 10.1136/amiajnl-2013-002159. Epub 2013 Dec 2.
2. Creating a computer assisted ICD coding system: Performance metric choice and use of the ICD hierarchy.
   J Biomed Inform. 2024 Apr;152:104617. doi: 10.1016/j.jbi.2024.104617. Epub 2024 Mar 1.
3. Explainable automated coding of clinical notes using hierarchical label-wise attention networks and label embedding initialisation.
   J Biomed Inform. 2021 Apr;116:103728. doi: 10.1016/j.jbi.2021.103728. Epub 2021 Mar 9.
4. Can GPT-3.5 generate and code discharge summaries?
   J Am Med Inform Assoc. 2024 Oct 1;31(10):2284-2293. doi: 10.1093/jamia/ocae132.
5. An empirical evaluation of supervised learning approaches in assigning diagnosis codes to electronic medical records.
   Artif Intell Med. 2015 Oct;65(2):155-66. doi: 10.1016/j.artmed.2015.04.007. Epub 2015 May 15.
6. A hierarchical method to automatically encode Chinese diagnoses through semantic similarity estimation.
   BMC Med Inform Decis Mak. 2016 Mar 3;16:30. doi: 10.1186/s12911-016-0269-4.
7. Enhanced ICD-10 code assignment of clinical texts: A summarization-based approach.
   Artif Intell Med. 2024 Oct;156:102967. doi: 10.1016/j.artmed.2024.102967. Epub 2024 Aug 20.
8. An empirical evaluation of deep learning for ICD-9 code assignment using MIMIC-III clinical notes.
   Comput Methods Programs Biomed. 2019 Aug;177:141-153. doi: 10.1016/j.cmpb.2019.05.024. Epub 2019 May 25.
9. Automated ICD-9 Coding via A Deep Learning Approach.
   IEEE/ACM Trans Comput Biol Bioinform. 2019 Jul-Aug;16(4):1193-1202. doi: 10.1109/TCBB.2018.2817488. Epub 2018 Mar 20.
10. Automated ICD-10 code assignment of nonstandard diagnoses via a two-stage framework.
    Artif Intell Med. 2020 Aug;108:101939. doi: 10.1016/j.artmed.2020.101939. Epub 2020 Aug 15.

Cited by

1. Enhancing medical coding efficiency through domain-specific fine-tuned large language models.
   Npj Health Syst. 2025;2(1):14. doi: 10.1038/s44401-025-00018-3. Epub 2025 May 1.
2. Digitalising the past decades: automated ICD-10 coding of unstructured free text dermatological diagnoses.
   BMC Health Serv Res. 2024 Oct 29;24(1):1297. doi: 10.1186/s12913-024-11761-y.
3. PheWP2V: a phenome-wide prediction framework with weighted patient representations using electronic health records.
   JAMIA Open. 2024 Sep 14;7(3):ooae084. doi: 10.1093/jamiaopen/ooae084. eCollection 2024 Oct.
4. ICDXML: enhancing ICD coding with probabilistic label trees and dynamic semantic representations.
   Sci Rep. 2024 Aug 7;14(1):18319. doi: 10.1038/s41598-024-69214-9.
5. AnEMIC: A Framework for Benchmarking ICD Coding Models.
   Proc Conf Empir Methods Nat Lang Process. 2022 Dec;2022(SD):109-120. doi: 10.18653/v1/2022.emnlp-demos.11.
6. Choice of refractive surgery types for myopia assisted by machine learning based on doctors' surgical selection data.
   BMC Med Inform Decis Mak. 2024 Feb 8;24(1):41. doi: 10.1186/s12911-024-02451-0.
7. A Curriculum Batching Strategy for Automatic ICD Coding with Deep Multi-Label Classification Models.
   Healthcare (Basel). 2022 Nov 29;10(12):2397. doi: 10.3390/healthcare10122397.
8. Classification of user queries according to a hierarchical medical procedure encoding system using an ensemble classifier.
   Front Artif Intell. 2022 Nov 4;5:1000283. doi: 10.3389/frai.2022.1000283. eCollection 2022.
9. Comparing Deep Learning and Conventional Machine Learning Models for Predicting Mental Illness from History of Present Illness Notations.
   AMIA Annu Symp Proc. 2022 Feb 21;2021:1109-1118. eCollection 2021.
10. Medical code prediction via capsule networks and ICD knowledge.
    BMC Med Inform Decis Mak. 2021 Jul 30;21(Suppl 2):55. doi: 10.1186/s12911-021-01426-9.

References

1. Temporal properties of diagnosis code time series in aggregate.
   IEEE J Biomed Health Inform. 2013 Mar;17(2):477-83. doi: 10.1109/JBHI.2013.2244610.
2. Improving the electronic health record--are clinicians getting what they wished for?
   JAMA. 2013 Mar 13;309(10):991-2. doi: 10.1001/jama.2013.890.
3. BioPortal: enhanced functionality via new Web services from the National Center for Biomedical Ontology to access and use ontologies in software applications.
   Nucleic Acids Res. 2011 Jul;39(Web Server issue):W541-5. doi: 10.1093/nar/gkr469. Epub 2011 Jun 14.
4. Multiparameter Intelligent Monitoring in Intensive Care II: a public-access intensive care unit database.
   Crit Care Med. 2011 May;39(5):952-60. doi: 10.1097/CCM.0b013e31820a92c6.
5. A systematic literature review of automated clinical coding and classification systems.
   J Am Med Inform Assoc. 2010 Nov-Dec;17(6):646-51. doi: 10.1136/jamia.2009.001024.
6. Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record.
   Am J Hum Genet. 2010 Apr 9;86(4):560-72. doi: 10.1016/j.ajhg.2010.03.003. Epub 2010 Apr 1.
7. PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations.
   Bioinformatics. 2010 May 1;26(9):1205-10. doi: 10.1093/bioinformatics/btq126. Epub 2010 Mar 24.
8. Three approaches to automatic assignment of ICD-9-CM codes to radiology reports.
   AMIA Annu Symp Proc. 2007 Oct 11;2007:279-83.
9. Automatic construction of rule-based ICD-9-CM coding systems.
   BMC Bioinformatics. 2008 Apr 11;9 Suppl 3(Suppl 3):S10. doi: 10.1186/1471-2105-9-S3-S10.
10. Probing genetic overlap among complex human phenotypes.
    Proc Natl Acad Sci U S A. 2007 Jul 10;104(28):11694-9. doi: 10.1073/pnas.0704820104. Epub 2007 Jul 3.