

Automatic inference of ICD-10 codes from German ophthalmologic physicians' letters using natural language processing.

Affiliations

Eye Center of the University Hospital Freiburg, Medical Faculty of the Albert-Ludwigs-University Freiburg, Freiburg, Germany.

Department of Ophthalmology, Asklepios Hospital Nord-Heidberg, Hamburg, Germany.

Publication information

Sci Rep. 2024 Apr 19;14(1):9035. doi: 10.1038/s41598-024-59926-3.

Abstract

Physicians' letters are the optimal source of diagnoses for registries. However, most registries require diagnosis codes such as ICD-10. We describe an algorithm that infers ICD-10 codes from German ophthalmologic physicians' letters and assess it in three German eye hospitals. The algorithm is based on the nearest-neighbor method and on a large thesaurus for ICD-10 codes. This thesaurus was embedded into a Word2Vec space created from anonymized physicians' reports of the first hospital. For evaluation, each of the three hospitals sent all diagnoses taken from 100 letters, and the senders evaluated the inferred ICD-10 codes for correctness. A total of 3332 natural language terms were submitted (812 from hospital one, 1473 from hospital two, 1047 from hospital three). Of these, 526 non-diagnoses were excluded upfront, and ICD-10 codes were inferred for the remaining 2806 terms (771 from hospital one, 1226 from hospital two, 809 from hospital three). In the first hospital, 98% of the inferred codes were fully correct and 99% were correct at the level of the superordinate disease concept. The corresponding percentages were 69% and 86% in hospital two and 69% and 91% in hospital three. Our simple method is capable of inferring ICD-10 codes for German natural language diagnoses, especially when the embedding space has been built with physicians' letters from the same hospital. The method may yield sufficient accuracy for many tasks in the multi-centric setting and can easily be adapted to other languages and specialities.
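The abstract's core idea, a nearest-neighbor lookup of a free-text diagnosis among thesaurus terms embedded in a word-vector space, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy vectors stand in for a trained Word2Vec model, the two-entry thesaurus is invented for illustration, averaging word vectors is an assumed way of embedding multi-word terms, and comparing the part of the code before the dot is an assumed reading of "correct at the level of the superordinate disease concept". All function names are hypothetical.

```python
import numpy as np

# Hypothetical toy vectors standing in for a Word2Vec space trained on letters.
WORD_VECTORS = {
    "katarakt": np.array([1.0, 0.0, 0.0]),
    "cataracta": np.array([0.9, 0.1, 0.0]),
    "senilis": np.array([0.7, 0.0, 0.3]),
    "glaukom": np.array([0.0, 1.0, 0.0]),
    "offenwinkelglaukom": np.array([0.1, 0.9, 0.1]),
    "primäres": np.array([0.0, 0.6, 0.4]),
}

# Hypothetical two-entry thesaurus mapping terms to ICD-10 codes.
THESAURUS = {
    "cataracta senilis": "H25.9",
    "primäres offenwinkelglaukom": "H40.1",
}


def embed(text):
    """Average the vectors of all known tokens (assumed embedding scheme)."""
    vectors = [WORD_VECTORS[t] for t in text.lower().split() if t in WORD_VECTORS]
    if not vectors:
        return None  # no known token, so no position in the space
    return np.mean(vectors, axis=0)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def infer_icd10(diagnosis):
    """Return the code of the thesaurus term nearest to the diagnosis."""
    query = embed(diagnosis)
    if query is None:
        return None
    best_code, best_similarity = None, -1.0
    for term, code in THESAURUS.items():
        term_vector = embed(term)
        if term_vector is None:
            continue
        similarity = cosine(query, term_vector)
        if similarity > best_similarity:
            best_code, best_similarity = code, similarity
    return best_code


def same_category(code_a, code_b):
    """Assumed 'superordinate' check: compare the part before the dot."""
    return code_a.split(".")[0] == code_b.split(".")[0]


if __name__ == "__main__":
    print(infer_icd10("Katarakt"))
    print(infer_icd10("Glaukom"))
    print(same_category("H25.9", "H25.1"))
```

In a sketch like this, a term whose tokens are all missing from the vector space yields no code at all, which is one plausible reason why accuracy would drop when the letters come from a hospital whose vocabulary differs from the one used to build the embedding.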




