Bodenreider Olivier
U.S. National Library of Medicine, National Institutes of Health, Department of Health & Human Services, Bethesda, Maryland, USA.
AMIA Annu Symp Proc. 2003;2003:101-5.
To investigate three aspects of the redundancy of hierarchical relations across biomedical terminologies: 1) What proportion of the relations is redundant? 2) Which terminologies tend to overlap with other terminologies? 3) Is there a link between redundancy and semantic consistency?
Hierarchical relations are counted in the various families of terminologies integrated into the UMLS and an index of redundancy is computed for each relation. Similarity among sources is computed using the classical cosine method. Semantic consistency is evaluated by reference to the UMLS Semantic Network.
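The counting and similarity steps described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not define the index of redundancy or the vector weighting used for the cosine, so the sketch assumes that a relation is redundant when more than one source asserts it, and that each source is a binary vector over (parent, child) pairs. The source names and relations are hypothetical toy data, not UMLS content.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: source name -> set of (parent, child) hierarchical relations.
sources = {
    "SRC_A": {("heart disease", "myocarditis"), ("disease", "heart disease")},
    "SRC_B": {("heart disease", "myocarditis"), ("disease", "infection")},
    "SRC_C": {("disease", "neoplasm")},
}

def redundancy_counts(sources):
    """Count, for each relation, how many sources assert it (assumed index)."""
    counts = Counter()
    for relations in sources.values():
        counts.update(relations)
    return counts

def redundant_fraction(sources):
    """Fraction of distinct relations asserted by more than one source."""
    counts = redundancy_counts(sources)
    if not counts:
        return 0.0
    redundant = sum(1 for n in counts.values() if n > 1)
    return redundant / len(counts)

def cosine_similarity(a, b):
    """Cosine between two sources treated as binary vectors over relations.

    For binary vectors the dot product is the size of the intersection and
    each norm is the square root of the set size.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

print(redundant_fraction(sources))
print(cosine_similarity(sources["SRC_A"], sources["SRC_B"]))
```

With sets as the representation, the cosine reduces to the shared-relation count divided by the geometric mean of the two source sizes, which is why near-identical sources such as consecutive versions of one terminology would score close to 1 under these assumptions.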
Overall, 29% of the 1,128,261 relations examined exhibit redundancy. The most similar sources include consecutive versions of terminologies. The link between redundancy and semantic consistency is weak.
Applications of these findings are discussed, including selecting sources, selecting useful relations, and auditing the categorization of UMLS concepts.