College of Pharmacy, University of Minnesota, Twin Cities, Minneapolis, MN 55455, USA.
J Biomed Inform. 2011 Apr;44(2):251-65. doi: 10.1016/j.jbi.2010.10.004. Epub 2010 Oct 31.
Our objective is to develop a framework for creating reference standards for functional testing of computerized measures of semantic relatedness. Currently, research on computerized approaches to semantic relatedness between biomedical concepts relies on reference standards that were created for specific purposes and analyzed with a variety of methods. In most cases, these reference standards are not publicly available, and the information published in manuscripts that evaluate computerized semantic relatedness measures is not sufficient to reproduce the results. Our proposed framework is based on the experiences of the medical informatics and computational linguistics communities and addresses practical and theoretical issues in creating reference standards for semantic relatedness. We demonstrate the use of the framework on a pilot set of 101 medical term pairs rated for semantic relatedness by 13 medical coding experts. Although the reliability of this particular reference standard is in the "moderate" range, we show that clustering and factor analyses offer a data-driven approach to finding systematic differences among raters and identifying groups of potential outliers. We test two ontology-based measures of relatedness and provide both the reference standard containing individual ratings and the R program used to analyze the ratings as open-source resources. Currently, these resources are intended to be used to reproduce and compare results of studies involving computerized measures of semantic relatedness. Our framework may be extended to the development of reference standards in other research areas in medical informatics, including automatic classification, information retrieval from medical records, and vocabulary/ontology development.
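As a rough illustration of what a data-driven look at rater differences can involve, the sketch below computes each rater's mean pairwise Pearson correlation with the other raters and flags those who fall below a cutoff. This is a deliberately simpler stand-in for the clustering and factor analyses named in the abstract, not the authors' R program; the function names, the toy ratings, and the cutoff are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_agreement(ratings):
    """Map each rater to their mean correlation with every other rater.

    `ratings` maps a rater label to a list of scores, one per term pair,
    with all lists in the same term-pair order.
    """
    totals = {rater: 0.0 for rater in ratings}
    for a, b in combinations(ratings, 2):
        r = pearson(ratings[a], ratings[b])
        totals[a] += r
        totals[b] += r
    others = len(ratings) - 1
    return {rater: total / others for rater, total in totals.items()}


def flag_outliers(ratings, threshold):
    """Return raters whose mean agreement is below `threshold` (assumed cutoff)."""
    agreement = mean_agreement(ratings)
    return sorted(r for r, value in agreement.items() if value < threshold)


# Hypothetical toy data: four raters scoring six term pairs.
toy_ratings = {
    "rater_a": [1, 2, 3, 4, 5, 6],
    "rater_b": [2, 3, 4, 5, 6, 7],
    "rater_c": [1, 3, 2, 5, 4, 6],
    "rater_d": [6, 5, 4, 3, 2, 1],  # rates in the opposite direction
}

if __name__ == "__main__":
    print(mean_agreement(toy_ratings))
    print(flag_outliers(toy_ratings, threshold=0.0))
```

In this toy data, `rater_d` reverses the ordering used by the other three raters and is the only one with negative mean agreement. A fixed cutoff is a crude device; hierarchical clustering or factor analysis of the rater correlation matrix, as the abstract describes, can reveal groups of raters rather than only individuals.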