Ilakkuvan Vinu, Tacelosky Michael, Ivey Keith C, Pearson Jennifer L, Cantrell Jennifer, Vallone Donna M, Abrams David B, Kirchner Thomas R
Department of Research and Evaluation, Legacy, Washington, DC, United States.
JMIR Res Protoc. 2014 Apr 9;3(2):e22. doi: 10.2196/resprot.3277.
Photographs are an effective way to collect detailed and objective information about the environment, particularly for public health surveillance. However, accurately and reliably annotating (ie, extracting information from) photographs remains difficult, a critical bottleneck inhibiting the use of photographs for systematic surveillance. The advent of distributed human computation (ie, crowdsourcing) platforms represents a veritable breakthrough, making it possible for the first time to accurately, quickly, and repeatedly annotate photos at relatively low cost.
This paper describes a methods protocol, using photographs from point-of-sale surveillance studies in the field of tobacco control to demonstrate the development and testing of custom-built tools that can greatly enhance the quality of crowdsourced annotation.
Enhancing the quality of crowdsourced photo annotation requires a number of approaches and tools. The annotation process is greatly simplified by decomposing it into smaller tasks, which improves accuracy and speed and enables adaptive processing, in which irrelevant data are filtered out and more difficult targets receive increased scrutiny. Additionally, zoom tools enable users to see details within photographs, and crop tools highlight where within an image a specific object of interest is found, generating a set of photographs that answer specific questions. Beyond such tools, optimizing the number of raters (ie, crowd size) for accuracy and reliability is an important facet of crowdsourced photo annotation. Crowd size can be determined systematically from the difficulty of the task and the desired level of accuracy, using receiver operating characteristic (ROC) analyses.
Usability tests of the zoom and crop tools suggest that these tools significantly improve annotation accuracy. The tests asked raters to extract data from photographs, not to assess the quality of that data, but to assess the usefulness of the tools. The proportion of individuals accurately identifying the presence of a specific advertisement was higher when they were provided with pictures of the product's logo and an example of the ad, and even higher when they were also provided with the zoom tool (χ²(2)=155.7, P<.001). Similarly, when provided with cropped images, a significantly greater proportion of respondents accurately identified the presence of cigarette product ads (χ²(1)=75.14, P<.001) and reported being able to read prices (χ²(2)=227.6, P<.001).
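The task decomposition and adaptive processing described in the methods above can be illustrated with a minimal sketch. It assumes a first-pass yes/no task ("does this photo show a tobacco ad?") whose votes decide whether a photo is filtered out, passed on for detailed annotation, or sent to more raters. The function names, the three routing labels, and the 0.8 agreement threshold are hypothetical choices made for illustration; they are not taken from the protocol.

```python
from collections import Counter


def majority_label(votes):
    """Return (most common vote, share of raters who gave it)."""
    if not votes:
        raise ValueError("at least one vote is required")
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


def route_photo(first_pass_votes, agreement_threshold=0.8):
    """Route a photo after a simple first-pass yes/no task.

    first_pass_votes: list of booleans, True = "photo shows a tobacco ad".
    agreement_threshold: hypothetical cutoff below which raters are
        considered to disagree too much to trust the majority.
    Returns "discard", "annotate", or "escalate".
    """
    label, agreement = majority_label(first_pass_votes)
    if agreement < agreement_threshold:
        # Difficult target: request more raters before deciding.
        return "escalate"
    # Clear consensus: keep relevant photos, filter out irrelevant ones.
    return "annotate" if label else "discard"


if __name__ == "__main__":
    print(route_photo([True, True, True, True, True]))
    print(route_photo([False, False, False, False, False]))
    print(route_photo([True, True, False, False, True]))
```

Only photos routed to "annotate" would go on to the more detailed tasks (for example, cropping an ad or reading a price), which is one way irrelevant data could be dropped early while ambiguous photos receive more scrutiny.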
Crowdsourced photo-only assessments corresponded closely with traditional field survey data: the area under the ROC curve produced by sensitivity analyses averaged over 0.95, and on average 10 to 15 crowdsourced raters were required to achieve values of over 0.90.
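As a rough illustration of how area under the ROC curve can be related to crowd size, the sketch below scores each photo by the fraction of the first k raters who report that an item is present, then compares those scores against a reference label such as a field survey result. All data here are simulated: the rater accuracy of 0.7, the 400 photos, and the helper names are assumptions made for this example, and the numbers it prints are not the study's results.

```python
import random


def auc(scores, truths):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, t in zip(scores, truths) if t]
    neg = [s for s, t in zip(scores, truths) if not t]
    if not pos or not neg:
        raise ValueError("need both positive and negative photos")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate_votes(truths, n_raters, rater_accuracy, rng):
    """Hypothetical raters who each answer correctly with a fixed probability."""
    return [
        [t if rng.random() < rater_accuracy else (not t) for _ in range(n_raters)]
        for t in truths
    ]


def auc_by_crowd_size(votes, truths, sizes):
    """AUC when each photo is scored by the share of its first k 'present' votes."""
    result = {}
    for k in sizes:
        scores = [sum(v[:k]) / k for v in votes]
        result[k] = auc(scores, truths)
    return result


# Simulated example: reference labels and 15 raters per photo (assumed values).
rng = random.Random(0)
truths = [rng.random() < 0.5 for _ in range(400)]
votes = simulate_votes(truths, n_raters=15, rater_accuracy=0.7, rng=rng)
curve = auc_by_crowd_size(votes, truths, sizes=[1, 5, 10, 15])

if __name__ == "__main__":
    for k, value in curve.items():
        print(f"crowd size {k:2d}: AUC = {value:.3f}")
```

With real data, the simulated votes would be replaced by actual rater responses and the reference labels by field survey observations; the smallest crowd size whose AUC exceeds the desired level then gives a candidate number of raters for that task.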
Further testing and improvement of these tools and processes are currently under way, including systematic evaluations that crowdsource photograph annotation and methodically assess the quality of raters' work.
Overall, the combination of crowdsourcing technologies with tiered data flow and tools that enhance annotation quality represents a breakthrough solution to the problem of photograph annotation, vastly expanding opportunities for the use of photographs rich in public health and other data on a scale previously unimaginable.