Suppr 超能文献




Creation of reliable relevance judgments in information retrieval systems evaluation experimentation through crowdsourcing: a review.

Authors

Samimi Parnia, Ravana Sri Devi

Affiliations

Department of Information Systems, Faculty of Computer Science and Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia.

Publication Information

ScientificWorldJournal. 2014;2014:135641. doi: 10.1155/2014/135641. Epub 2014 May 19.

DOI: 10.1155/2014/135641
PMID: 24977172
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4055211/
Abstract

A test collection is used to evaluate information retrieval systems in laboratory-based evaluation experiments. In the classic setting, generating relevance judgments involves human assessors and is a costly and time-consuming task. Researchers and practitioners are still challenged to perform reliable, low-cost evaluations of retrieval systems. Crowdsourcing, a novel method of data acquisition, is broadly used in many research fields. It has been shown to be an inexpensive and quick solution, as well as a reliable alternative, for creating relevance judgments. One application of crowdsourcing in IR is judging the relevance of query-document pairs. For a crowdsourcing experiment to succeed, the relevance judgment tasks should be designed carefully, with an emphasis on quality control. This paper explores the factors that influence the accuracy of relevance judgments produced by workers and how to strengthen the reliability of judgments in crowdsourcing experiments.
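The quality-control theme of the abstract can be illustrated with a minimal sketch of majority-vote aggregation over redundant worker labels, one common technique for making crowdsourced relevance judgments more reliable. The data layout and function name below are hypothetical illustrations, not taken from the paper:

```python
from collections import Counter

def aggregate_judgments(labels_by_pair):
    """Aggregate redundant worker labels per query-document pair by majority vote.

    labels_by_pair maps (query_id, doc_id) -> list of worker labels
    (e.g. 1 = relevant, 0 = non-relevant). Returns, for each pair, the
    majority label and the fraction of workers who agreed with it.
    """
    consensus = {}
    for pair, labels in labels_by_pair.items():
        winner, count = Counter(labels).most_common(1)[0]
        consensus[pair] = {
            "label": winner,
            "agreement": count / len(labels),  # 1.0 means unanimous
        }
    return consensus

# Each query-document pair judged redundantly by three workers.
judgments = {
    ("q1", "d1"): [1, 1, 0],
    ("q1", "d2"): [0, 0, 0],
    ("q2", "d1"): [1, 0, 1],
}
print(aggregate_judgments(judgments))
```

The agreement ratio gives a simple per-pair reliability signal: pairs with low agreement can be re-judged by additional workers or escalated to expert assessors, which is one way to apply the quality control the review emphasizes.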


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5dc7/4055211/a70dbbb615e5/TSWJ2014-135641.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5dc7/4055211/b16d88740bf6/TSWJ2014-135641.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5dc7/4055211/1392180908b9/TSWJ2014-135641.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5dc7/4055211/cbd63ce1d2ad/TSWJ2014-135641.004.jpg

Similar Articles

1. Creation of reliable relevance judgments in information retrieval systems evaluation experimentation through crowdsourcing: a review.
   ScientificWorldJournal. 2014;2014:135641. doi: 10.1155/2014/135641. Epub 2014 May 19.
2. Evaluation of a novel Conjunctive Exploratory Navigation Interface for consumer health information: a crowdsourced comparative study.
   J Med Internet Res. 2014 Feb 10;16(2):e45. doi: 10.2196/jmir.3111.
3. Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols.
   BMC Bioinformatics. 2008 Feb 29;9:132. doi: 10.1186/1471-2105-9-132.
4. Can masses of non-experts train highly accurate image classifiers? A crowdsourcing approach to instrument segmentation in laparoscopic images.
   Med Image Comput Comput Assist Interv. 2014;17(Pt 2):438-45. doi: 10.1007/978-3-319-10470-6_55.
5. Query expansion with a medical ontology to improve a multimodal information retrieval system.
   Comput Biol Med. 2009 Apr;39(4):396-403. doi: 10.1016/j.compbiomed.2009.01.012. Epub 2009 Mar 6.
6. Hybrid curation of gene-mutation relations combining automated extraction and crowdsourcing.
   Database (Oxford). 2014 Sep 22;2014. doi: 10.1093/database/bau094. Print 2014.
7. A similarity learning approach to content-based image retrieval: application to digital mammography.
   IEEE Trans Med Imaging. 2004 Oct;23(10):1233-44. doi: 10.1109/TMI.2004.834601.
8. An empirical comparison of nine pattern classifiers.
   IEEE Trans Syst Man Cybern B Cybern. 2005 Oct;35(5):1079-91. doi: 10.1109/tsmcb.2005.847745.
9. A convex approach to validation-based learning of the regularization constant.
   IEEE Trans Neural Netw. 2007 May;18(3):917-20. doi: 10.1109/TNN.2007.891187.
10. A comparison of decision tree ensemble creation techniques.
   IEEE Trans Pattern Anal Mach Intell. 2007 Jan;29(1):173-80. doi: 10.1109/tpami.2007.250609.

Cited By

1. Crowdsourced Identification of Possible Allergy-Associated Factors: Automated Hypothesis Generation and Validation Using Crowdsourcing Services.
   JMIR Res Protoc. 2017 May 16;6(5):e83. doi: 10.2196/resprot.5851.

References

1. Reputation Systems for Open Collaboration.
   Commun ACM. 2011 Aug;54(8):81-87. doi: 10.1145/1978542.1978560.
2. Conducting behavioral research on Amazon's Mechanical Turk.
   Behav Res Methods. 2012 Mar;44(1):1-23. doi: 10.3758/s13428-011-0124-6.
3. reCAPTCHA: human-based character recognition via Web security measures.
   Science. 2008 Sep 12;321(5895):1465-8. doi: 10.1126/science.1160379. Epub 2008 Aug 14.