Fralin Biomedical Research Institute at VTC.
Exp Clin Psychopharmacol. 2022 Aug;30(4):409-414. doi: 10.1037/pha0000549. Epub 2022 Feb 17.
Crowdsourced data collection platforms such as Amazon Mechanical Turk (MTurk) have been widely adopted in addiction science. Recent reports suggest an increase in poor quality data on MTurk, which challenges the validity of findings. However, empirical investigations of data quality in addiction-related samples are lacking. In this study of individuals with alcohol use disorder (AUD), we compared poor quality delay discounting data to randomly generated data. We reanalyzed previously published delay discounting data, comparing included, excluded, and randomly generated data samples. Criteria for identifying nonsystematic data were applied as the measure of data quality. The excluded data differed statistically from the included sample but did not differ from randomly generated data on multiple metrics. Moreover, a response bias was identified in the excluded data. This study provides empirical evidence that poor quality delay discounting data in an AUD sample are not statistically different from randomly generated data, suggesting that data quality concerns on MTurk persist in addiction samples. These findings support the use of rigorous, a priori defined criteria to remove poor quality data post hoc. Additionally, they show that using nonsystematic delay discounting criteria to remove poor quality data is rigorous and not simply a way of removing data that do not conform to an expected theoretical model. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
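As an illustration of how nonsystematic criteria can be applied to delay discounting data, the sketch below flags a series of indifference points and applies the same check to randomly generated series. It is a minimal sketch, not the authors' analysis: the abstract does not state which criteria or thresholds were used, so the two rules here follow the commonly cited Johnson and Bickel (2008) formulation as an assumption, and the uniform random generator, the seven delays, and the function names are hypothetical.

```python
import random


def is_nonsystematic(indiff, amount=1.0):
    """Flag a series of indifference points ordered by increasing delay.

    Assumed rules (not confirmed by the abstract):
      1. Any point exceeds the preceding point by more than 20% of the
         larger later amount.
      2. The last point is not lower than the first point by at least
         10% of the larger later amount.
    """
    jumps_up = any(
        later - earlier > 0.2 * amount
        for earlier, later in zip(indiff, indiff[1:])
    )
    no_overall_decline = (indiff[0] - indiff[-1]) < 0.1 * amount
    return jumps_up or no_overall_decline


def random_indifference_points(n_delays=7, amount=1.0, rng=None):
    """Draw indifference points uniformly at random (hypothetical generator)."""
    rng = rng or random.Random()
    return [rng.uniform(0, amount) for _ in range(n_delays)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
sample = [random_indifference_points(rng=rng) for _ in range(1000)]
flagged = sum(is_nonsystematic(series) for series in sample)
print(f"{flagged / len(sample):.1%} of random series flagged as nonsystematic")
```

Under these assumptions, a series that declines steadily with delay passes both rules, while a series with a large upward jump or no overall decline is flagged. Comparing the flag rate in excluded data with the rate in random series is one simple way to express the comparison the abstract describes.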