Digital Health China Technologies Co. Ltd., Beijing, China.
Melax Technologies, Inc, Houston, TX, USA.
Stud Health Technol Inform. 2022 Jun 6;290:42-46. doi: 10.3233/SHTI220028.
The objective of this study was to develop a hybrid method for mapping the International Statistical Classification of Diseases, 10th revision, Chinese version (ICD-10-CN) to the Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT), and to perform an initial evaluation of the resulting mappings. The method combines three steps: reusing existing mappings, term similarity modeling for automatic mapping, and manual review. We evaluated the results of the automatic mapping and the coverage of the mappings between the two terminologies. Experimental results showed that fine-tuning the pre-trained biomedical language model PubmedBERT gave the best performance, with a precision of 0.859, a recall of 0.773, and an F1 of 0.814. All (100%) of the 4-digit ICD-10-CN terms were mapped to SNOMED-CT terms through existing code mappings. About 42.41% of the randomly selected 6-digit ICD-10-CN terms had exact matches among SNOMED-CT terms, and we did not find appropriate SNOMED-CT terms for ICD grouping terms.
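The three-step flow described in the abstract (reuse an existing mapping, otherwise try automatic term matching, otherwise send the term to manual review) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fine-tuned PubmedBERT similarity model is replaced here by a simple character-level ratio from the standard library, and the function names, the 0.8 threshold, and the toy codes and identifiers are all hypothetical. The `f1_score` helper only checks that the reported F1 is consistent with the reported precision and recall.

```python
from difflib import SequenceMatcher


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def similarity(a: str, b: str) -> float:
    """Stand-in lexical similarity in [0, 1]; NOT the paper's PubmedBERT model."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def map_term(icd_code, icd_term, existing_maps, snomed_terms, threshold=0.8):
    """Return (target_id, route) for one ICD term.

    existing_maps: dict of ICD code -> target id (hypothetical reuse table)
    snomed_terms:  dict of target id -> term text (hypothetical candidates)
    threshold:     assumed cut-off for accepting an automatic match
    """
    # Step 1: reuse an existing code-level mapping when one is available.
    if icd_code in existing_maps:
        return existing_maps[icd_code], "existing_map"

    # Step 2: automatic mapping by picking the most similar candidate term.
    best_id, best_score = None, 0.0
    for target_id, target_term in snomed_terms.items():
        score = similarity(icd_term, target_term)
        if score > best_score:
            best_id, best_score = target_id, score
    if best_id is not None and best_score >= threshold:
        return best_id, "automatic"

    # Step 3: nothing convincing was found, so defer to a human reviewer.
    return None, "manual_review"


# Toy data: codes and identifiers below are placeholders, not real entries.
existing = {"X00.0": "SCT-A"}
candidates = {
    "SCT-B": "Type 2 diabetes mellitus",
    "SCT-C": "Essential hypertension",
}

print(map_term("X00.0", "some 4-digit term", existing, candidates))
print(map_term("X11.100", "type 2 diabetes mellitus", existing, candidates))
print(map_term("X99.900", "Grouping heading", existing, candidates))

# Consistency check on the reported metrics.
print(round(f1_score(0.859, 0.773), 3))  # 0.814
```

In this sketch, terms that fall through to `manual_review` play the role of the ICD grouping terms for which no appropriate SNOMED-CT term was found.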