An expert study evaluating the UMLS lexical metaschema.

Authors

Zhang Li, Hripcsak George, Perl Yehoshua, Halper Michael, Geller James

Affiliation

Computer Science Department, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA.

Publication

Artif Intell Med. 2005 Jul;34(3):219-33. doi: 10.1016/j.artmed.2005.01.002.

DOI:10.1016/j.artmed.2005.01.002

PMID:15996860

Abstract

OBJECTIVE

A metaschema is an abstraction network of the semantic network (SN) of the Unified Medical Language System (UMLS), obtained from a connected partition of the SN's collection of semantic types. A lexical metaschema was previously derived from a lexical partition, which divided the SN into semantic-type groups using identical word usage among the names of semantic types and the definitions of their respective children. This paper presents a statistical analysis methodology for evaluating the lexical metaschema, based on a study involving a group of established UMLS experts.

METHODS

In the study, each expert was asked to identify subject areas of the SN based on his or her understanding of the various semantic types. For this purpose, the expert scanned the SN hierarchy top-down and marked as group roots those semantic types that are important and sufficiently different from their parent semantic types. From each expert's response, an "expert metaschema" was constructed. Because the experts' metaschemas can vary widely, additional metaschemas were obtained from aggregations of the experts' responses. Of special interest is the consensus metaschema, which represents an aggregation of a simple majority of the experts' responses. A statistical analysis comparing the lexical metaschema with the experts' metaschemas and the consensus metaschema is presented.
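The aggregation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the strict "more than half" reading of simple majority, the rule that a type belongs to its nearest root ancestor, and the toy hierarchy are all assumptions made for illustration.

```python
from collections import Counter

def consensus_roots(expert_roots, parent):
    """Roots chosen by a simple majority of experts (assumed: strictly more than half)."""
    votes = Counter(t for roots in expert_roots for t in set(roots))
    threshold = len(expert_roots) / 2
    chosen = {t for t, n in votes.items() if n > threshold}
    # Assumption: a type with no parent always starts its own group.
    chosen |= {t for t, p in parent.items() if p is None}
    return chosen

def partition_by_roots(parent, roots):
    """Assign each semantic type to its nearest ancestor-or-self that is a root."""
    groups = {}
    for t in parent:
        node = t
        while node not in roots:
            node = parent[node]  # climb toward the top of the hierarchy
        groups.setdefault(node, set()).add(t)
    return groups

# Toy hierarchy (child -> parent); illustrative only, not a copy of the SN.
parent = {
    "Entity": None,
    "Physical Object": "Entity",
    "Conceptual Entity": "Entity",
    "Organism": "Physical Object",
    "Plant": "Organism",
    "Animal": "Organism",
}

# Hypothetical responses from three experts: the roots each one marked.
expert_roots = [
    {"Organism", "Conceptual Entity"},
    {"Organism"},
    {"Physical Object", "Organism"},
]

roots = consensus_roots(expert_roots, parent)
groups = partition_by_roots(parent, roots)
```

Because every type joins the group of its nearest root ancestor, each group is a connected piece of the hierarchy, which matches the "connected partition" requirement stated in the Objective.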

RESULTS

The analysis shows that 17 of the 21 meta-semantic types in the lexical metaschema (about 81%) also appear in the consensus metaschema. There are 107 semantic types (about 79%) covered by identical meta-semantic types and refinements. These results indicate high similarity between the two metaschemas. Furthermore, the statistical analysis shows that the lexical metaschema did not grossly underperform compared to the experts.
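One way to count identical meta-semantic types is sketched below in Python. It assumes that two groups are identical when they share the same root and the same member set, which the abstract does not spell out; the function name and the toy groups are hypothetical. Only the counts 17 and 21 come from the results above.

```python
def identical_groups(reference, other):
    """Roots whose group has exactly the same members in both metaschemas."""
    return {root for root, members in reference.items()
            if other.get(root) == members}

# Toy metaschemas (root -> member semantic types); illustrative only.
lexical = {
    "Entity": {"Entity", "Physical Object", "Conceptual Entity"},
    "Organism": {"Organism", "Plant", "Animal"},
}
consensus = {
    "Entity": {"Entity", "Conceptual Entity"},
    "Physical Object": {"Physical Object"},
    "Organism": {"Organism", "Plant", "Animal"},
}

shared = identical_groups(lexical, consensus)
toy_fraction = len(shared) / len(lexical)

# Arithmetic check of the reported figure: 17 of 21 meta-semantic types.
reported = f"{17 / 21:.0%}"
```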

CONCLUSION

Our study shows that the lexical metaschema provides a good approximation for a partition of meaningful subject areas in the SN, when compared to the consensus metaschema capturing the aggregation of a simple majority of the human experts' opinions.
