Department of Psychology and Program in Linguistics, Wayne State University, 5057 Woodward Avenue, Detroit, MI 48202, USA.
Behav Res Methods. 2010 Feb;42(1):273-85. doi: 10.3758/BRM.42.1.273.
In Experiment 1, separate samples rated nouns on danger, using either an online survey or the same survey in person. In Experiment 2, a single sample rated words on familiarity, using both methods. Women's in-person and online ratings correlated significantly better than men's. In-person ratings correlated significantly better with existing norms in 4 of 8 instances. There were significant effects of condition on mean ratings and completion times. Ratings from participants who withdrew from the experiment correlated significantly less well with existing norms than did ratings from those who completed the whole experiment, in 12 of 16 instances. Analysis of existing data showed that a different statistical conclusion is reached depending on whether in-person or online ratings are used. Furthermore, the categorization of 17.9% (Experiment 1) and 5.3% (Experiment 2) of the items as high or low depends on which ratings are used. Ratings gathered in person and online cannot be freely substituted.
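The abstract does not say which test was used to compare correlations. As an illustration only, one common way to test whether two correlations from independent groups differ (for example, women's versus men's in-person/online correlations) is Fisher's r-to-z transformation. The sketch below assumes independent samples and Pearson correlations; the function name `fisher_z_compare` and all numbers in the usage lines are hypothetical and are not taken from the study.

```python
import math


def fisher_z_compare(r1, n1, r2, n2):
    """Two-tailed test of whether two independent Pearson correlations differ.

    Assumes the two correlations come from separate (independent) samples
    of sizes n1 and n2, each larger than 3.
    """
    # Fisher's r-to-z transformation (inverse hyperbolic tangent).
    z1, z2 = math.atanh(r1), math.atanh(r2)
    # Standard error of the difference between the transformed values.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-tailed p value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical values for illustration only (not data from the study).
z, p = fisher_z_compare(0.90, 103, 0.50, 103)
print(f"z = {z:.2f}, p = {p:.4g}")
```

This test would not be appropriate for correlations computed on the same participants, such as the within-sample comparison in Experiment 2, which calls for a test for dependent correlations.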