Department of Mathematics and Statistics, University of Vermont, Burlington, VT, USA.
Vermont Complex Systems Center, University of Vermont, Burlington, VT, USA.
Sci Rep. 2022 Apr 27;12(1):6849. doi: 10.1038/s41598-022-10794-9.
Allocation strategies improve the efficiency of crowdsourcing by decreasing the work needed to complete individual tasks accurately. However, these algorithms introduce bias by preferentially allocating workers to easy tasks, so the set of completed tasks is no longer representative of all tasks. This bias complicates inference of problem-wide properties such as typical task difficulty, and of crowd properties such as worker completion times, which are important information beyond the crowd responses themselves. Here we study inference about problem properties when an allocation algorithm is used to improve crowd efficiency. We introduce Decision-Explicit Probability Sampling (DEPS), a novel method to perform inference of problem properties while accounting for the potential bias introduced by an allocation strategy. Experiments on real and synthetic crowdsourcing data show that DEPS outperforms baseline inference methods while still leveraging the efficiency gains of the allocation method. The ability to perform accurate inference of general properties from non-representative data allows crowdsourcers to extract more knowledge from a given crowdsourced dataset.
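The allocation bias described above can be illustrated with a small simulation. This is a minimal sketch and not DEPS itself: it assumes a toy allocation rule in which a task with difficulty d is completed with known probability 1 - d, and it uses generic inverse-probability weighting as a stand-in correction. The function `simulate`, its parameters, and the difficulty range are all hypothetical.

```python
import random


def simulate(n_tasks=20000, seed=0):
    """Toy illustration of allocation bias (assumed setup, not the DEPS method)."""
    rng = random.Random(seed)
    # Hypothetical task difficulties, spread uniformly over (0.05, 0.95).
    difficulties = [rng.uniform(0.05, 0.95) for _ in range(n_tasks)]

    completed = []
    for d in difficulties:
        # Assumed allocation rule: easier tasks are more likely to be completed.
        p_complete = 1.0 - d
        if rng.random() < p_complete:
            completed.append((d, p_complete))

    # Ground truth over all tasks, completed or not.
    true_mean = sum(difficulties) / n_tasks
    # Naive estimate: average over completed tasks only (biased toward easy tasks).
    naive_mean = sum(d for d, _ in completed) / len(completed)
    # Inverse-probability-weighted estimate: each completed task counts 1 / p_complete.
    weight_total = sum(1.0 / p for _, p in completed)
    weighted_mean = sum(d / p for d, p in completed) / weight_total
    return true_mean, naive_mean, weighted_mean


true_mean, naive_mean, weighted_mean = simulate()
print(f"true={true_mean:.3f} naive={naive_mean:.3f} weighted={weighted_mean:.3f}")
```

Under these assumptions the naive average over completed tasks underestimates typical difficulty, while the weighted estimate stays close to the true mean. The correction here relies on knowing each task's completion probability, which a real allocation algorithm may not expose directly.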