Hou Zhen, Liu Hao, Bian Jiang, He Xing, Zhuang Yan
Department of Biomedical Engineering and Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University, Indianapolis, IN USA.
School of Computing, College of Science and Mathematics, Montclair State University, Montclair, NJ USA.
Npj Health Syst. 2025;2(1):14. doi: 10.1038/s44401-025-00018-3. Epub 2025 May 1.
Medical coding is essential for healthcare operations yet remains predominantly manual, error-prone (error rates of up to 20%), and costly (up to $18.2 billion annually). Although large language models (LLMs) have shown promise in natural language processing, their application to medical coding has achieved only limited accuracy. In this study, we evaluated whether fine-tuning LLMs with specialized ICD-10 knowledge can automate code generation from clinical documentation. We adopted a two-phase approach: initial fine-tuning on 74,260 ICD-10 code-description pairs, followed by enhanced training to address linguistic and lexical variations. Evaluations using a proprietary model (GPT-4o mini) on a cloud platform and an open-source model (Llama) on local GPUs showed that initial fine-tuning increased exact-match accuracy from less than 1% to 97%, while enhanced fine-tuning further improved performance in complex scenarios, with real-world clinical notes achieving 69.20% exact match and 87.16% category match. These findings indicate that domain-specific fine-tuned LLMs can reduce manual coding burden and improve reliability.
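A minimal sketch, not the authors' pipeline, of the two ingredients the abstract describes: serializing ICD-10 code-description pairs into chat-style JSONL records for supervised fine-tuning, and scoring predictions by exact match (full code) versus category match (three-character ICD-10 category). The JSONL layout, system prompt, and helper names are illustrative assumptions.

```python
# Hypothetical sketch: format ICD-10 code-description pairs for supervised
# fine-tuning and score predictions by exact vs. category match.
# The field names and prompt text are assumptions, not the study's schema.
import json


def to_finetune_example(description: str, code: str) -> dict:
    """Wrap one code-description pair in a chat-style training record."""
    return {
        "messages": [
            {"role": "system", "content": "Return the ICD-10-CM code for the diagnosis."},
            {"role": "user", "content": description},
            {"role": "assistant", "content": code},
        ]
    }


def write_jsonl(pairs, path="icd10_finetune.jsonl"):
    """Write (description, code) pairs as one JSON record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for description, code in pairs:
            f.write(json.dumps(to_finetune_example(description, code)) + "\n")


def exact_match(pred: str, gold: str) -> bool:
    """Full-code agreement, e.g. 'E11.9' vs 'E11.9'."""
    return pred.strip().upper() == gold.strip().upper()


def category_match(pred: str, gold: str) -> bool:
    """Agreement on the three-character category, e.g. 'E11.65' and 'E11.9' share 'E11'."""
    return pred.strip().upper()[:3] == gold.strip().upper()[:3]


if __name__ == "__main__":
    pairs = [("Type 2 diabetes mellitus without complications", "E11.9")]
    write_jsonl(pairs)
    print(exact_match("E11.9", "E11.9"))      # True
    print(category_match("E11.65", "E11.9"))  # True: same category, not an exact match
```

Under this reading, the reported 69.20% exact match and 87.16% category match differ precisely because category match credits predictions that land in the correct three-character block but miss the full code.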