Al-Mubaid Hisham, Nguyen Hoa A
Univ. of Houston-Clear Lake, Houston, TX 77058, USA.
Conf Proc IEEE Eng Med Biol Soc. 2006;2006:2713-7. doi: 10.1109/IEMBS.2006.259235.
We propose a new cluster-based semantic similarity/distance measure for the biomedical domain within the framework of UMLS. The proposed measure is based mainly on the cross-modified path length feature between the concept nodes and on two new features: (1) the common specificity of two concept nodes, and (2) the local granularity of the clusters. For comparison, we also applied five existing ontology-based similarity measures, originally developed for general English, to the biomedical domain within UMLS. The proposed measure was evaluated against human experts' ratings and compared with the existing techniques using two ontologies in UMLS (MeSH and SNOMED-CT). The experimental results confirmed the effectiveness of the proposed method and showed that our similarity measure gives the best overall correlation with human ratings. We further show that, for all of the tested measures, using the MeSH ontology produces better correlation with human experts' scores than using SNOMED-CT.
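The abstract does not state the formula of the proposed measure, so the following is only a minimal sketch of two notions it refers to: the path length between two concept nodes, and the depth of their deepest shared ancestor (one common way to gauge how specific the shared part of two concepts is). The toy hierarchy, the concept names, and the function names are all hypothetical; they are not taken from MeSH, SNOMED-CT, or the paper, and the code does not implement the cross-modified path length, common specificity, or local granularity features themselves.

```python
# Toy is-a hierarchy stored as child -> parent.
# Concept names are hypothetical, not real MeSH or SNOMED-CT entries.
PARENT = {
    "disease": None,
    "infection": "disease",
    "neoplasm": "disease",
    "bacterial_infection": "infection",
    "viral_infection": "infection",
    "carcinoma": "neoplasm",
}


def ancestors(node):
    """Return the chain from `node` up to the root, starting with `node`."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def depth(node):
    """Depth of a node, counting the root as depth 1."""
    return len(ancestors(node))


def deepest_shared_ancestor(a, b):
    """First ancestor of `b` (walking upward) that is also an ancestor of `a`."""
    ancestors_of_a = set(ancestors(a))
    for candidate in ancestors(b):
        if candidate in ancestors_of_a:
            return candidate
    raise ValueError("nodes do not share a root")


def path_length(a, b):
    """Number of edges on the shortest is-a path between `a` and `b`."""
    shared = deepest_shared_ancestor(a, b)
    return depth(a) + depth(b) - 2 * depth(shared)


if __name__ == "__main__":
    for pair in [("bacterial_infection", "viral_infection"),
                 ("bacterial_infection", "carcinoma")]:
        shared = deepest_shared_ancestor(*pair)
        print(pair, "path:", path_length(*pair),
              "shared ancestor:", shared, "depth:", depth(shared))
```

In this toy tree the two infection concepts are closer (shorter path, deeper shared ancestor) than an infection and a carcinoma, which is the kind of intuition a path-based measure is meant to capture.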