Scoring best-worst data in unbalanced many-item designs, with applications to crowdsourcing semantic judgments.

Affiliation

Department of Psychology, University of Alberta, P217 Biological Sciences Building, Edmonton, Alberta, T6G 2E9, Canada.

Publication information

Behav Res Methods. 2018 Apr;50(2):711-729. doi: 10.3758/s13428-017-0898-2.

DOI: 10.3758/s13428-017-0898-2
PMID: 28550657

Abstract

Best-worst scaling is a judgment format in which participants are presented with a set of items and have to choose the superior and inferior items in the set. Best-worst scaling generates a large quantity of information per judgment because each judgment allows for inferences about the rank value of all unjudged items. This property of best-worst scaling makes it a promising judgment format for research in psychology and natural language processing concerned with estimating the semantic properties of tens of thousands of words. A variety of different scoring algorithms have been devised in the previous literature on best-worst scaling. However, due to problems of computational efficiency, these scoring algorithms cannot be applied efficiently to cases in which thousands of items need to be scored. New algorithms are presented here for converting responses from best-worst scaling into item scores for thousands of items (many-item scoring problems). These scoring algorithms are validated through simulation and empirical experiments, and considerations related to noise, the underlying distribution of true values, and trial design are identified that can affect the relative quality of the derived item scores. The newly introduced scoring algorithms consistently outperformed scoring algorithms used in the previous literature on scoring many-item best-worst data.
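
To make the judgment format concrete, the following is a minimal sketch of two ideas from the abstract: how a single best-worst judgment implies orderings among items that were never directly compared, and how responses can be converted into item scores. The scoring shown is the simple best-minus-worst counting baseline, used here as an assumption for illustration; it is not one of the newly introduced algorithms, which the abstract does not specify. The function names (`implied_pairs`, `count_scores`) and the example words are hypothetical.

```python
from collections import defaultdict


def implied_pairs(items, best, worst):
    """Return the (higher, lower) orderings implied by one best-worst trial."""
    pairs = set()
    for item in items:
        if item != best:
            pairs.add((best, item))   # "best" outranks every other item
        if item != best and item != worst:
            pairs.add((item, worst))  # every middle item outranks "worst"
    return pairs


def count_scores(trials):
    """Counting baseline: (times best - times worst) / times shown."""
    best_n = defaultdict(int)
    worst_n = defaultdict(int)
    seen = defaultdict(int)
    for items, best, worst in trials:
        for item in items:
            seen[item] += 1
        best_n[best] += 1
        worst_n[worst] += 1
    return {item: (best_n[item] - worst_n[item]) / seen[item] for item in seen}


# Hypothetical trials: (items shown, chosen best, chosen worst).
# The design is unbalanced: items do not all appear equally often.
trials = [
    (("calm", "joy", "fear", "rage"), "joy", "rage"),
    (("calm", "joy", "fear", "grief"), "joy", "grief"),
    (("calm", "fear", "rage", "grief"), "calm", "rage"),
]

pairs = implied_pairs(*trials[0])
scores = count_scores(trials)
print(sorted(pairs))
print(scores)
```

In a four-item trial, one judgment settles five of the six possible pairwise orderings; only the two middle items remain unordered relative to each other. Dividing by the number of appearances is one simple way to handle items that are shown unequally often, though the counting score ignores how strong the competing items in each trial were.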

