Yu Bei, Willis Matt, Sun Peiyuan, Wang Jun
School of Information Studies, Syracuse University, Syracuse, NY 13244, USA.
J Med Internet Res. 2013 Jun 3;15(6):e108. doi: 10.2196/jmir.2513.
Consumer and patient participation has proved to be an effective approach to medical pictogram design, but it can be costly and time-consuming. We proposed and evaluated an inexpensive alternative that crowdsources the pictogram evaluation task to Amazon Mechanical Turk (MTurk) workers, usually referred to as "turkers".
This study addressed two research questions: (1) Is the turkers' collective effort effective for identifying design problems in medical pictograms? (2) Do the turkers' demographic characteristics affect their performance in medical pictogram comprehension?
We designed a Web-based survey (open-ended tests) to ask 100 US turkers to type in their guesses of the meaning of 20 US pharmacopeial pictograms. Two judges independently coded the turkers' guesses into four categories: correct, partially correct, wrong, and completely wrong. The comprehensibility of a pictogram was measured by the percentage of correct guesses, with each partially correct guess counted as 0.5 correct. We then conducted a content analysis on the turkers' interpretations to identify misunderstandings and assess whether the misunderstandings were common. We also conducted a statistical analysis to examine the relationship between turkers' demographic characteristics and their pictogram comprehension performance.
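The comprehensibility measure described above (percentage of correct guesses, with each partially correct guess counted as 0.5) can be sketched as follows; the category labels and the example counts are illustrative assumptions, not the study's data:

```python
# Hedged sketch of the comprehensibility scoring described above.
# The four coding categories follow the text; the sample counts are made up.

def comprehensibility(codes):
    """Percent correct, with each 'partially correct' guess counted as 0.5."""
    weights = {"correct": 1.0, "partially correct": 0.5,
               "wrong": 0.0, "completely wrong": 0.0}
    return 100.0 * sum(weights[c] for c in codes) / len(codes)

# Illustrative example: 60 correct, 25 partially correct, 15 wrong
# out of 100 guesses for one pictogram.
codes = ["correct"] * 60 + ["partially correct"] * 25 + ["wrong"] * 15
print(comprehensibility(codes))  # 72.5
```

Under this weighting, a pictogram's score rises half a point per hundred guesses for every partially correct answer, so partial understanding is credited without being conflated with full comprehension.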
The survey was completed within 3 days of our posting the task to MTurk, and the collected data are publicly available for download in the multimedia appendix. The comprehensibility of the 20 tested pictograms ranged from 45% to 98%, with an average of 72.5%. The comprehensibility scores of 10 of the pictograms were strongly correlated with the scores for the same pictograms reported in another study that used oral-response open-ended testing with local participants. The turkers' misinterpretations shared common errors that exposed design problems in the pictograms. Participant performance was positively correlated with educational level.
The results confirmed that crowdsourcing can serve as an effective and inexpensive approach to participatory evaluation of medical pictograms. Through Web-based open-ended testing, the crowd can effectively identify problems in pictogram designs. The results also confirmed that education has a significant effect on the comprehension of medical pictograms. Since low-literate people are underrepresented in the turker population, further investigation is needed to examine to what extent turkers' misunderstandings overlap with those elicited from low-literate people.