Relevance similarity: an alternative means to monitor information retrieval systems.

Authors

Dong Peng, Loh Marie, Mondry Adrian

Affiliation

Medical Statistics and Epidemiology Group, Bioinformatics Institute, Singapore.

Publication

Biomed Digit Libr. 2005 Jul 20;2:6. doi: 10.1186/1742-5581-2-6.

Abstract

BACKGROUND

Relevance assessment is a major problem in the evaluation of information retrieval systems. This work introduces a new parameter, "Relevance Similarity", for measuring variation in relevance assessment. Where individual assessments can be compared with a gold standard, the parameter is used to study the effect of such variation on the performance of a medical information retrieval system. In that setting, Relevance Similarity is the proportion of assessors in a group who rate a given document the same as the gold standard.
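The definition above can be written as a formula. This is only a sketch of the stated ratio: the symbols $A$ (the group of assessors), $r_a(d)$ (assessor $a$'s rating of document $d$) and $g(d)$ (the gold standard rating of $d$) are notation assumed here, not taken from the paper.

```latex
\mathrm{RS}(d) \;=\; \frac{\bigl|\{\, a \in A : r_a(d) = g(d) \,\}\bigr|}{|A|},
\qquad 0 \le \mathrm{RS}(d) \le 1 .
```

Under this reading, a value of 1 means every assessor in the group matched the gold standard for that document, and a value of 0 means none did.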

METHODS

The study was carried out on a collection of Critically Appraised Topics (CATs). Twelve volunteers were divided into two groups according to their domain knowledge. They assessed the relevance of the topics retrieved by querying a meta-search engine with ten keywords related to medical science. Their assessments were compared with the gold standard assessment, and Relevance Similarities were calculated as the ratio of positive concordance with the gold standard for each topic.
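The per-topic calculation described above might look like the following minimal Python sketch. The function name, the topic identifiers, the rating labels and all judgments are hypothetical illustrations, not data from the study; the group size of six is also an assumption, since the abstract states only that twelve volunteers were split into two groups.

```python
from typing import Dict, List


def relevance_similarity(assessments: List[str], gold: str) -> float:
    """Fraction of assessors whose rating equals the gold standard rating."""
    if not assessments:
        raise ValueError("at least one assessor is required")
    matches = sum(1 for rating in assessments if rating == gold)
    return matches / len(assessments)


# Hypothetical gold standard ratings for three retrieved topics.
gold_standard: Dict[str, str] = {
    "topic_1": "relevant",
    "topic_2": "not relevant",
    "topic_3": "relevant",
}

# Hypothetical ratings from one group of six assessors (assumed group size).
group_judgments: Dict[str, List[str]] = {
    "topic_1": ["relevant"] * 6,
    "topic_2": ["not relevant"] * 3 + ["relevant"] * 3,
    "topic_3": ["relevant"] * 5 + ["not relevant"],
}

# One Relevance Similarity value per topic.
similarities = {
    topic: relevance_similarity(group_judgments[topic], gold_standard[topic])
    for topic in gold_standard
}

for topic, value in similarities.items():
    print(f"{topic}: {value:.2f}")
```

Running the same calculation separately for each group of assessors would give two sets of per-topic values that can then be compared between groups.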

RESULTS

The comparison of similarity between groups showed a higher degree of agreement among evaluators with more subject knowledge. In this particular query set, the variation in relevance assessment did not significantly change the measured performance of the retrieval system.

CONCLUSION

In assessment situations where evaluators can be compared to a gold standard, Relevance Similarity provides an alternative evaluation technique to the commonly used kappa scores, which may give paradoxically low scores in highly biased situations such as document repositories containing large quantities of relevant data.
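The kappa behaviour mentioned above can be illustrated with a small numerical sketch. It uses Cohen's kappa for a single assessor against the gold standard, which is an assumption made for illustration: the abstract does not say which kappa variant the authors had in mind. The counts are invented to represent a repository in which almost every document is relevant.

```python
def cohen_kappa(both_yes: int, rater_only: int, gold_only: int, both_no: int) -> float:
    """Cohen's kappa for a 2x2 table of one rater against a reference rating."""
    n = both_yes + rater_only + gold_only + both_no
    if n == 0:
        raise ValueError("the table must contain at least one document")
    observed = (both_yes + both_no) / n
    rater_yes = (both_yes + rater_only) / n
    gold_yes = (both_yes + gold_only) / n
    # Agreement expected by chance, from the marginal proportions.
    expected = rater_yes * gold_yes + (1 - rater_yes) * (1 - gold_yes)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 100 documents, 95 of them relevant per the gold standard.
both_yes, rater_only, gold_only, both_no = 90, 5, 5, 0

observed_agreement = (both_yes + both_no) / 100
kappa = cohen_kappa(both_yes, rater_only, gold_only, both_no)

print(f"observed agreement: {observed_agreement:.2f}")
print(f"Cohen's kappa:      {kappa:.3f}")
```

With these invented counts the assessor matches the gold standard on 90% of documents, yet kappa comes out slightly negative because chance agreement is already about 0.905 when nearly everything is relevant. A ratio of plain concordance with the gold standard, such as Relevance Similarity, is not corrected for chance and so does not collapse in this way.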

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53d7/1181804/122d37ddc8a1/1742-5581-2-6-1.jpg
