Crowdsourcing for Food Purchase Receipt Annotation via Amazon Mechanical Turk: A Feasibility Study.

Author information

Lu Wenhua, Guttentag Alexandra, Elbel Brian, Kiszko Kamila, Abrams Courtney, Kirchner Thomas R

Affiliations

Department of Childhood Studies, Rutgers, The State University of New Jersey, Camden, NJ, United States.

College of Global Public Health, New York University, New York, NY, United States.

Publication information

J Med Internet Res. 2019 Apr 5;21(4):e12047. doi: 10.2196/12047.

Abstract

BACKGROUND

The decisions that individuals make about the food and beverage products they purchase and consume directly influence their energy intake and dietary quality and may lead to excess weight gain and obesity. However, gathering and interpreting data on food and beverage purchase patterns can be difficult. Leveraging novel sources of data on food and beverage purchase behavior can provide us with a more objective understanding of food consumption behaviors.

OBJECTIVE

Food and beverage purchase receipts often include time-stamped location information, which, when associated with product purchase details, can provide a useful behavioral measurement tool. The purpose of this study was to assess the feasibility, reliability, and validity of processing data from fast-food restaurant receipts using crowdsourcing via Amazon Mechanical Turk (MTurk).

METHODS

Between 2013 and 2014, receipts (N=12,165) from consumer purchases were collected at 60 different locations of five fast-food restaurant chains in New Jersey and New York City, USA (ie, Burger King, KFC, McDonald's, Subway, and Wendy's). Data containing the restaurant name, location, receipt ID, food items purchased, price, and other information were manually entered into an MS Access database and checked for accuracy by a second reviewer; this was considered the gold standard. To assess the feasibility of coding receipt data via MTurk, a prototype set of receipts (N=196) was selected. For each receipt, 5 turkers were asked to (1) identify the receipt identifier and the name of the restaurant and (2) indicate whether a beverage was listed in the receipt; if yes, they were to categorize the beverage as cold (eg, soda or energy drink) or hot (eg, coffee or tea). Interturker agreement for specific questions (eg, restaurant name and beverage inclusion) and agreement between turker consensus responses and the gold standard values in the manually entered dataset were calculated.
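The two agreement measures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how consensus was derived from the 5 turker responses, so the strict-majority rule, the function names `consensus` and `agreement_rate`, and the toy data are all hypothetical.

```python
from collections import Counter


def consensus(answers):
    """Return the answer given by a strict majority of turkers, else None.

    Assumption: the strict-majority rule is illustrative; the study's
    actual consensus rule is not stated in the abstract.
    """
    top, count = Counter(answers).most_common(1)[0]
    return top if count > len(answers) / 2 else None


def agreement_rate(responses, gold):
    """Fraction of receipts whose consensus answer equals the gold-standard value."""
    matches = sum(
        1 for receipt_id, answers in responses.items()
        if consensus(answers) == gold[receipt_id]
    )
    return matches / len(responses)


# Hypothetical toy data: receipt ID -> five turker answers on beverage inclusion.
responses = {
    "r1": ["cold", "cold", "cold", "cold", "hot"],
    "r2": ["none", "none", "none", "none", "none"],
    "r3": ["hot", "cold", "none", "hot", "cold"],  # no strict majority
}
gold = {"r1": "cold", "r2": "none", "r3": "hot"}

# Share of receipts on which the turkers reached a consensus at all.
interturker = sum(consensus(a) is not None for a in responses.values()) / len(responses)
# Share of receipts whose consensus matches the manually entered value.
vs_gold = agreement_rate(responses, gold)
```

In this toy example, two of the three receipts reach a consensus and both of those match the gold standard, so both measures are 2/3.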

RESULTS

Among the 196 receipts completed by turkers, the interturker agreement was 100% (196/196) for restaurant names (eg, Burger King, McDonald's, and Subway), 98.5% (193/196) for beverage inclusion (ie, hot, cold, or none), 92.3% (181/196) for types of hot beverage (eg, hot coffee or hot tea), and 87.2% (171/196) for types of cold beverage (eg, Coke or bottled water). When compared with the gold standard data, the agreement level was 100% (196/196) for restaurant name, 99.5% (195/196) for beverage inclusion, and 99.5% (195/196) for beverage types.
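As a quick arithmetic check, the reported percentages follow from the stated counts out of 196 receipts. The helper name `percent` is hypothetical. The counts are taken directly from the results above.

```python
def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


TOTAL = 196  # receipts in the prototype set

# Counts as reported in the results.
interturker_counts = {
    "restaurant name": 196,
    "beverage inclusion": 193,
    "hot beverage type": 181,
    "cold beverage type": 171,
}
gold_standard_counts = {
    "restaurant name": 196,
    "beverage inclusion": 195,
    "beverage type": 195,
}

interturker_pct = {k: percent(v, TOTAL) for k, v in interturker_counts.items()}
gold_standard_pct = {k: percent(v, TOTAL) for k, v in gold_standard_counts.items()}
```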

CONCLUSIONS

Our findings indicated high interrater agreement for questions across difficulty levels (eg, single- vs binary- vs multiple-choice items). Compared with traditional methods for coding receipt data, MTurk can produce excellent-quality data in a lower-cost, more time-efficient manner.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f9ad/6473207/aba0a0513962/jmir_v21i4e12047_fig1.jpg