Automatic inference of ICD-10 codes from German ophthalmologic physicians' letters using natural language processing.

Affiliations

Eye Center of the University Hospital Freiburg, Medical Faculty of the Albert-Ludwigs-University Freiburg, Freiburg, Germany.

Department of Ophthalmology, Asklepios Hospital Nord-Heidberg, Hamburg, Germany.

Publication information

Sci Rep. 2024 Apr 19;14(1):9035. doi: 10.1038/s41598-024-59926-3.

DOI: 10.1038/s41598-024-59926-3

PMID: 38641674

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11031573/

Abstract

Physicians' letters are the optimal source of diagnoses for registries. However, most registries require diagnosis codes such as ICD-10. Here we describe an algorithm that infers ICD-10 codes from German ophthalmologic physicians' letters, and we assess the method in three German eye hospitals. The algorithm is based on the nearest-neighbor method and on a large thesaurus for ICD-10 codes. This thesaurus was embedded into a Word2Vec space created from anonymized physicians' reports of the first hospital. For evaluation, each of the three hospitals sent all diagnoses taken from 100 letters. The inferred ICD-10 codes were evaluated for correctness by the senders. A total of 3332 natural language terms were submitted (812 from hospital one, 1473 from hospital two, 1047 from hospital three). A total of 526 non-diagnoses were excluded upfront. A total of 2806 ICD-10 codes were inferred (771 for hospital one, 1226 for hospital two, 809 for hospital three). In the first hospital, 98% were fully correct and 99% were correct at the level of the superordinate disease concept. The percentages in hospital two were 69% and 86%. The respective numbers for hospital three were 69% and 91%. Our simple method is capable of inferring ICD-10 codes for German natural language diagnoses, especially when the embedding space has been built with physicians' letters from the same hospital. The method may yield sufficient accuracy for many tasks in the multi-centric setting and can easily be adapted to other languages and specialities.
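The nearest-neighbor lookup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the token vectors below are hand-written stand-ins for a trained Word2Vec model, the three thesaurus entries and their codes are illustrative only, and the names `TOKEN_VECTORS`, `THESAURUS`, `embed`, `infer_code`, and `same_category` are hypothetical. Averaging token vectors to embed a multi-word term, and treating the three-character ICD-10 category as the "superordinate disease concept", are both assumptions; the abstract specifies neither.

```python
import math

# Hypothetical toy vectors standing in for a Word2Vec space
# trained on anonymized physicians' letters.
TOKEN_VECTORS = {
    "glaukom": [1.0, 0.0, 0.0],
    "offenwinkelglaukom": [0.9, 0.1, 0.0],
    "primäres": [0.1, 0.1, 0.1],
    "katarakt": [0.0, 1.0, 0.0],
    "cataracta": [0.0, 0.9, 0.1],
    "senilis": [0.1, 0.2, 0.1],
    "makuladegeneration": [0.0, 0.0, 1.0],
    "altersbedingte": [0.1, 0.1, 0.2],
}

# Illustrative thesaurus: natural language term -> ICD-10 code.
THESAURUS = {
    "primäres offenwinkelglaukom": "H40.1",
    "cataracta senilis": "H25.9",
    "altersbedingte makuladegeneration": "H35.3",
}


def embed(term):
    """Average the vectors of all known tokens (an assumption); None if none are known."""
    vectors = [TOKEN_VECTORS[t] for t in term.lower().split() if t in TOKEN_VECTORS]
    if not vectors:
        return None
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def infer_code(term):
    """Return (code, similarity) of the nearest thesaurus entry, or None."""
    query = embed(term)
    if query is None:
        return None
    best = None
    for entry, code in THESAURUS.items():
        entry_vector = embed(entry)
        if entry_vector is None:
            continue
        similarity = cosine(query, entry_vector)
        if best is None or similarity > best[1]:
            best = (code, similarity)
    return best


def same_category(code_a, code_b):
    """Compare the part before the dot (assumed superordinate level)."""
    return code_a.split(".")[0] == code_b.split(".")[0]
```

In this sketch, a term whose tokens are all missing from the embedding space yields no code at all. That is one plausible way in which letters from a hospital with different vocabulary could lose accuracy, although the abstract does not state the cause of the lower scores at hospitals two and three.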
