Relevance similarity: an alternative means to monitor information retrieval systems.

Author information

Dong Peng, Loh Marie, Mondry Adrian

Affiliation

Medical Statistics and Epidemiology Group, Bioinformatics Institute, Singapore.

Publication information

Biomed Digit Libr. 2005 Jul 20;2:6. doi: 10.1186/1742-5581-2-6.

DOI:10.1186/1742-5581-2-6

PMID:16029513

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1181804/

Abstract

BACKGROUND

Relevance assessment is a major problem in the evaluation of information retrieval systems. The work presented here introduces a new parameter, "Relevance Similarity", for measuring the variation in relevance assessment. Where individual assessments can be compared with a gold standard, this parameter is used to study the effect of such variation on the performance of a medical information retrieval system. In such a setting, Relevance Similarity is the ratio of the number of assessors who rank a given document the same as the gold standard to the total number of assessors in the group.
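The definition above can be written compactly. The notation is an assumption introduced here for illustration, not taken from the paper: $A$ is the group of assessors, $r_a(d)$ is the judgement of assessor $a$ on document $d$, and $g(d)$ is the gold-standard judgement.

```latex
\[
  \mathrm{RS}(d) \;=\; \frac{\bigl|\{\, a \in A : r_a(d) = g(d) \,\}\bigr|}{|A|},
  \qquad 0 \le \mathrm{RS}(d) \le 1 .
\]
```

Under this reading, a value of 1 means every assessor in the group matched the gold standard for that document, and a value of 0 means none did.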

METHODS

The study was carried out on a collection of Critically Appraised Topics (CATs). Twelve volunteers were divided into two groups according to their domain knowledge. They assessed the relevance of the topics retrieved by querying a meta-search engine with ten keywords related to medical science. Their assessments were compared to the gold standard assessment, and Relevance Similarities were calculated as the ratio of positive concordance with the gold standard for each topic.
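The per-topic calculation described above could be sketched as follows. This is a minimal illustration, assuming binary relevant/not-relevant judgements; the function names, the topic identifiers, and the sample votes are hypothetical and do not come from the study.

```python
from typing import Dict, List


def relevance_similarity(assessments: List[bool], gold: bool) -> float:
    """Fraction of assessors whose judgement matches the gold standard.

    Assumes binary judgements (True = relevant); this is an illustrative
    reading of the abstract's definition, not the authors' code.
    """
    if not assessments:
        raise ValueError("at least one assessment is required")
    agreeing = sum(1 for judgement in assessments if judgement == gold)
    return agreeing / len(assessments)


def per_topic_similarity(
    group: Dict[str, List[bool]], gold: Dict[str, bool]
) -> Dict[str, float]:
    """Relevance Similarity for every topic judged by one assessor group."""
    return {
        topic: relevance_similarity(votes, gold[topic])
        for topic, votes in group.items()
    }


# Hypothetical data: six assessors in one group, two retrieved topics.
example_group = {
    "topic-1": [True, True, False, True, True, True],
    "topic-2": [False, False, False, True, False, True],
}
example_gold = {"topic-1": True, "topic-2": False}

scores = per_topic_similarity(example_group, example_gold)
```

Computing the scores separately for each assessor group makes it possible to compare how closely each group tracks the gold standard, which is the kind of between-group comparison the study reports.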

RESULTS

The similarity comparison among groups showed that a higher degree of agreement exists among evaluators with more subject knowledge. For this particular query set, the variations in relevance assessment did not significantly change the measured performance of the retrieval system.

CONCLUSION

In assessment situations where evaluators can be compared to a gold standard, Relevance Similarity provides an alternative evaluation technique to the commonly used kappa scores, which may give paradoxically low scores in highly biased situations such as document repositories containing large quantities of relevant data.
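The kappa paradox mentioned above can be illustrated with a small calculation. The sketch below uses the standard two-rater Cohen's kappa formula on a hypothetical 2x2 table in which almost every document is relevant; the counts are invented for illustration and are not data from the study.

```python
from typing import Tuple


def cohen_kappa(
    both_yes: int, a_only: int, b_only: int, both_no: int
) -> Tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for a 2x2 agreement table.

    both_yes: both raters say relevant
    a_only:   only rater A says relevant
    b_only:   only rater B says relevant
    both_no:  both raters say not relevant
    """
    n = both_yes + a_only + b_only + both_no
    if n == 0:
        raise ValueError("the table is empty")
    p_observed = (both_yes + both_no) / n
    p_a_yes = (both_yes + a_only) / n
    p_b_yes = (both_yes + b_only) / n
    # Agreement expected by chance from the two raters' marginal rates.
    p_expected = p_a_yes * p_b_yes + (1 - p_a_yes) * (1 - p_b_yes)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa


# Hypothetical skewed table: 100 documents, 95% judged relevant by each rater.
p_o, kappa = cohen_kappa(both_yes=90, a_only=5, b_only=5, both_no=0)
```

In this invented table the two raters agree on 90 of 100 documents, yet chance agreement is 0.905, so kappa comes out slightly below zero. A ratio of concordance with the gold standard is not corrected for chance in this way, which is why it does not collapse when nearly all documents are relevant.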
