Evaluating semantic similarity between Chinese biomedical terms through multiple ontologies with score normalization: An initial study.

Author information

Ning Wenxin, Yu Ming, Kong Dehua

Affiliation

Health Care Services Research Center, Department of Industrial Engineering, Tsinghua University, Beijing 100084, China.

Publication information

J Biomed Inform. 2016 Dec;64:273-287. doi: 10.1016/j.jbi.2016.10.017. Epub 2016 Nov 1.

DOI: 10.1016/j.jbi.2016.10.017
PMID: 27810481

Abstract

BACKGROUND

Semantic similarity estimation significantly promotes the understanding of natural language resources and supports medical decision making. Previous studies have investigated semantic similarity and relatedness estimation between biomedical terms using English-language resources such as SNOMED-CT or UMLS. However, very few studies have focused on the Chinese language, and technology for natural language processing and text mining of medical documents in China is urgently needed. Because no complete, publicly available biomedical ontology exists in China, we only have access to several modest-sized ontologies with no overlaps. Although these ontologies together do not cover biomedicine completely, each covers its own domain acceptably. In this paper, semantic similarity estimation between Chinese biomedical terms using these multiple non-overlapping ontologies was explored as an initial study.

METHODS

Typical path-based and information content (IC)-based similarity measures were applied to these ontologies. Analysis of the computed similarity scores revealed heterogeneity in the statistical distributions of scores derived from the different ontologies. This heterogeneity hampers the comparability of scores and the overall accuracy of similarity estimation. The problem was addressed with a novel language-independent method that combines semantic similarity estimation and score normalization. A reference standard was also created in this study.
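The abstract does not name the specific measures used, so the following is only a minimal sketch of what path-based and IC-based similarity typically look like. It assumes a hypothetical toy is-a taxonomy (`PARENT`), a simple inverse-path-length measure, and an intrinsic IC computed from descendant counts; the Resnik and Lin formulas are common IC-based measures chosen for illustration, not confirmed as the ones in the study.

```python
import math

# Hypothetical toy is-a taxonomy (child -> parent); not taken from the study.
PARENT = {
    "disease": None,
    "infection": "disease",
    "pneumonia": "infection",
    "influenza": "infection",
    "tumor": "disease",
}

def ancestors(term):
    """Return the chain from term up to the root, term included."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain

def lcs(a, b):
    """Least common subsumer: the deepest ancestor shared by a and b."""
    anc_b = set(ancestors(b))
    for t in ancestors(a):
        if t in anc_b:
            return t
    raise ValueError("terms share no ancestor")

def path_similarity(a, b):
    """Path-based measure: 1 / (1 + is-a edges on the shortest path)."""
    c = lcs(a, b)
    edges = ancestors(a).index(c) + ancestors(b).index(c)
    return 1.0 / (1 + edges)

def intrinsic_ic(term):
    """Intrinsic IC (assumed here): -log(descendants incl. self / all concepts)."""
    n_desc = sum(1 for t in PARENT if term in ancestors(t))
    return -math.log(n_desc / len(PARENT))

def resnik(a, b):
    """Resnik similarity: IC of the least common subsumer."""
    return intrinsic_ic(lcs(a, b))

def lin(a, b):
    """Lin similarity: 2 * IC(lcs) / (IC(a) + IC(b))."""
    denom = intrinsic_ic(a) + intrinsic_ic(b)
    return 2 * resnik(a, b) / denom if denom else 1.0
```

Note that the path-based and Lin scores fall in [0, 1] while Resnik scores are unbounded above and depend on ontology size, which hints at why scores from different ontologies may not be directly comparable.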

RESULTS

Compared with the existing task-independent normalization methods, the newly developed method exhibited superior performance on most IC-based similarity measures. The accuracy of semantic similarity estimation was enhanced through score normalization. This enhancement resulted from the mitigation of heterogeneity in the similarity scores derived from multiple ontologies.
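To illustrate why normalization can reduce heterogeneity between ontologies, here is a minimal sketch using per-ontology z-score normalization. This is one widely used generic normalization, assumed here for illustration only; it is not the method proposed in the paper, whose formula the abstract does not give. The ontology names and scores are hypothetical.

```python
from statistics import mean, pstdev

def z_normalize(scores):
    """Rescale scores to zero mean and unit (population) standard deviation."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: no spread to rescale.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]

# Hypothetical raw similarity scores from two non-overlapping ontologies.
# Ontology B yields systematically higher scores than ontology A.
scores_by_ontology = {
    "ontology_A": [0.10, 0.20, 0.30],
    "ontology_B": [0.70, 0.80, 0.90],
}

# Normalize each ontology's scores separately so they share a common scale.
normalized = {
    name: z_normalize(values) for name, values in scores_by_ontology.items()
}
```

Before normalization, every score from the hypothetical ontology B exceeds every score from ontology A; afterwards the two sets occupy the same scale, so a term pair ranked highest within A is comparable to one ranked highest within B.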

CONCLUSION

We demonstrated the potential necessity of score normalization when estimating semantic similarity using ontology-based measures. The results of this study can also be extended to other language systems to implement semantic similarity estimation in biomedicine.
