Creating a bias-free dataset with food delivery app reviews under data poisoning attack.

Author information

Lee Hyunmin, Oh SeungYoung, Han JinHyun, Jung Hyunggu

Affiliation

Department of Computer Science and Engineering, University of Seoul, Seoul 02504, Republic of Korea.

Publication information

Data Brief. 2024 Jun 5;55:110598. doi: 10.1016/j.dib.2024.110598. eCollection 2024 Aug.

Abstract

In online food delivery apps, customers write reviews to reflect their experiences. However, certain restaurants use a "review event" strategy to solicit favorable reviews from customers and boost their revenue. A review event is a marketing strategy in which a restaurant owner gives customers free services in return for a promise to write a review. Existing datasets of food delivery app reviews do not account for this practice. Furthermore, there appears to be an absence of datasets with reviews written in Korean. To address this gap, this paper presents a dataset of reviews collected from restaurants on a Korean app that use a review event strategy. A total of 128,668 reviews were gathered from 136 restaurants by crawling with the Selenium library in Python. Each review record includes the ordered dishes, the time the review was written, whether the review includes a food image, and several star ratings: total, taste, quantity, and delivery. The dataset supports a process for preparing AI training data that aims at fair AI, proposing a bias-free dataset of food delivery app reviews with data poisoning attacks as an example. Additionally, the dataset is beneficial for researchers who are examining review events or analyzing the sentiment of food delivery app reviews.
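
To make the per-review fields listed in the abstract concrete, here is a minimal Python sketch of how one record might be represented and summarized. It is an illustration only: the abstract does not give the dataset's actual column names, types, or rating scale, so every field name, the `restaurant_id` field, both helper functions, and the sample values below are assumptions, not the authors' schema or data.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import List


@dataclass
class Review:
    """One review record. All field names are hypothetical."""
    restaurant_id: str          # assumed identifier, not named in the abstract
    ordered_dishes: List[str]   # dishes ordered by the reviewer
    written_at: datetime        # time the review was written
    has_food_image: bool        # whether the review includes a food image
    total_rating: int           # overall star rating
    taste_rating: int           # star rating for taste
    quantity_rating: int        # star rating for quantity
    delivery_rating: int        # star rating for delivery


def mean_total_rating(reviews: List[Review]) -> float:
    """Average overall star rating across a list of reviews."""
    return mean(r.total_rating for r in reviews)


def image_share(reviews: List[Review]) -> float:
    """Fraction of reviews that include a food image."""
    return sum(r.has_food_image for r in reviews) / len(reviews)


# Made-up sample records for illustration; these are not rows from the dataset.
sample_reviews = [
    Review("r001", ["dish A"], datetime(2023, 1, 5, 12, 30), True, 5, 5, 5, 5),
    Review("r001", ["dish A", "dish B"], datetime(2023, 1, 6, 19, 0), False, 4, 4, 5, 3),
]

if __name__ == "__main__":
    print(mean_total_rating(sample_reviews))
    print(image_share(sample_reviews))
```

Summaries like these (average rating, share of reviews with images) are the kind of per-restaurant statistics one might compare when examining review events, but the choice of statistics here is an assumption rather than something the abstract specifies.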

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c5c/11226804/3566f2cf20e8/gr2.jpg
