Cruanes Jorge, Romá-Ferri M Teresa, Lloret Elena
Department of Software and Computer Systems, University of Alicante, Alicante, Spain.
Stud Health Technol Inform. 2012;180:255-9.
One of the current problems in the health domain is the reuse and sharing of clinical information between different professionals, as it is written in natural language using specific terminologies. To overcome this issue, it is necessary to use a common terminology, such as SNOMED-CT, which allows information to be reused and gives health professionals the quickest access to quality information. In order to use this terminology, all the other terminologies have to be mapped to it. One way to perform that mapping is a lexical similarity approach. In this paper we analyze the appropriateness of 15 lexical similarity methods for mapping a set of NANDA-I labels to a set of SNOMED-CT descriptions in Spanish. Our aim is to establish how to choose the best algorithm in this domain from the point of view of recall and precision. After running six different tests, we have established that the three best algorithms were those that maximize recall, because they always return the best solution.
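As a rough illustration of what lexical-similarity mapping and its recall/precision evaluation can look like, the following minimal sketch scores each label against each candidate description and keeps the candidates above a threshold. It is not the authors' method: the similarity measure (Python's `difflib.SequenceMatcher`), the function names, the threshold values, and the toy Spanish strings are all assumptions made for illustration, and the strings are not actual NANDA-I or SNOMED-CT entries.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for any lexical measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def map_labels(labels, descriptions, threshold):
    """For each label, keep the descriptions whose similarity meets the threshold."""
    return {
        label: [d for d in descriptions if similarity(label, d) >= threshold]
        for label in labels
    }


def precision_recall(predicted, gold):
    """Micro-averaged precision and recall over (label, description) pairs."""
    pred_pairs = {(l, d) for l, ds in predicted.items() for d in ds}
    gold_pairs = {(l, d) for l, ds in gold.items() for d in ds}
    hits = len(pred_pairs & gold_pairs)
    precision = hits / len(pred_pairs) if pred_pairs else 0.0
    recall = hits / len(gold_pairs) if gold_pairs else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    labels = ["dolor agudo", "ansiedad"]
    descriptions = ["dolor agudo", "dolor crónico", "ansiedad", "trastorno de ansiedad"]
    gold = {"dolor agudo": ["dolor agudo"], "ansiedad": ["ansiedad"]}

    for threshold in (0.0, 0.5, 1.0):
        predicted = map_labels(labels, descriptions, threshold)
        p, r = precision_recall(predicted, gold)
        print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

With a threshold of 0.0 every candidate is returned, so recall is maximal but precision drops; raising the threshold trades recall for precision. This is the trade-off that has to be weighed when choosing an algorithm from the recall and precision point of view.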