A hierarchical method to automatically encode Chinese diagnoses through semantic similarity estimation.

Authors

Ning Wenxin, Yu Ming, Zhang Runtong

Affiliations

Health Care Services Research Center, Department of Industrial Engineering, Tsinghua University, Beijing, 100084, PR China.

Department of Information Management, School of Economics and Management, Beijing Jiaotong University, Beijing, 100084, PR China.

Publication information

BMC Med Inform Decis Mak. 2016 Mar 3;16:30. doi: 10.1186/s12911-016-0269-4.

DOI: 10.1186/s12911-016-0269-4
PMID: 26940992
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4778321/

Abstract

BACKGROUND

The volume of medical documents in China has grown rapidly in recent years. We focus on developing a method that automatically assigns ICD-10 codes to Chinese diagnoses from electronic medical records, to support the medical coding process in Chinese hospitals.

METHODS

We propose two encoding methods: one that directly determines the desired code (flat method), and one that hierarchically determines the most suitable code until the desired code is obtained (hierarchical method). Both methods are based on instances from the standard diagnostic library, a gold standard dataset in China. For the first time, semantic similarity estimation between Chinese words is applied in the biomedical domain, with the successful implementation of knowledge-based and distributional approaches. Characteristics of the Chinese language are considered in implementing distributional semantics. We test our methods against 16,330 coding instances from our partner hospital.
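
The contrast between the flat and hierarchical methods can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the five-entry `LIBRARY` is a toy stand-in for the standard diagnostic library, the bag-of-characters cosine in `cosine` is a simple substitute for the knowledge-based and distributional similarity measures of the paper, and the function names `flat_encode` and `hierarchical_encode` are hypothetical. It is not the authors' implementation.

```python
from collections import Counter
from math import sqrt

# Toy stand-in for a standard diagnostic library (assumption):
# ICD-10 code -> example Chinese diagnosis strings.
LIBRARY = {
    "E10.9": ["1型糖尿病"],
    "E11.9": ["2型糖尿病"],
    "I10": ["原发性高血压", "高血压病"],
    "I25.1": ["冠状动脉粥样硬化性心脏病"],
    "J18.9": ["肺炎"],
}


def profile(texts):
    """Bag of characters pooled over a collection of strings."""
    return Counter(ch for text in texts for ch in text)


def cosine(a, b):
    """Cosine similarity between two character-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def flat_encode(query):
    """Flat method (sketch): compare the query with every full code at once."""
    q = profile([query])
    return max(LIBRARY, key=lambda code: cosine(q, profile(LIBRARY[code])))


def hierarchical_encode(query, prefix_lengths=(1, 3)):
    """Hierarchical method (sketch): pick the best chapter letter, then the
    best three-character category, then the best full code within it."""
    q = profile([query])
    candidates = list(LIBRARY)
    for n in prefix_lengths:
        # Group the remaining candidate codes by their first n characters.
        groups = {}
        for code in candidates:
            groups.setdefault(code[:n], []).append(code)
        # Score each group by the pooled profile of all its instances.
        best = max(
            groups,
            key=lambda p: cosine(
                q, profile(t for c in groups[p] for t in LIBRARY[c])
            ),
        )
        candidates = groups[best]
    return max(candidates, key=lambda code: cosine(q, profile(LIBRARY[code])))


print(flat_encode("2型糖尿病"))
print(hierarchical_encode("高血压"))
```

In this sketch the hierarchical variant scores only a handful of groups per level instead of every code in the library, which is one plausible reading of why a hierarchical search can be cheaper on a large code set.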

RESULTS

The hierarchical method outperforms the flat method in terms of accuracy and time complexity. Representing distributional semantics using Chinese characters can achieve comparable performance to the use of Chinese words. The diagnoses in the test set can be encoded automatically with micro-averaged precision of 92.57%, recall of 89.63%, and F-score of 91.08%. A sharp decrease in encoding performance is observed without semantic similarity estimation.
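
As a consistency check on the reported figures, the sketch below computes micro-averaged precision, recall, and F-score from per-code counts and recomputes the F-score from the reported precision and recall. The helper names `micro_prf` and `f_score` and the example counts are assumptions for illustration; the abstract does not give per-code counts.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the balanced F-score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_prf(per_code_counts):
    """Micro-averaging: pool (tp, fp, fn) over all codes, then compute
    precision, recall, and F-score once on the pooled totals."""
    tp = sum(c[0] for c in per_code_counts)
    fp = sum(c[1] for c in per_code_counts)
    fn = sum(c[2] for c in per_code_counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_score(precision, recall)


# Hypothetical per-code counts (tp, fp, fn), not taken from the paper.
print(micro_prf([(8, 2, 1), (2, 0, 1)]))

# Recompute the F-score from the reported precision and recall.
print(round(f_score(92.57, 89.63), 2))  # 91.08
```

The recomputed value matches the reported F-score of 91.08%, so the three reported figures are mutually consistent.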

CONCLUSION

The hierarchical nature of ICD-10 codes can enhance the performance of automated code assignment. Semantic similarity estimation is shown to be indispensable in dealing with Chinese medical text. The proposed method can greatly reduce the workload and improve the efficiency of the code assignment process in Chinese hospitals.
