Faculty of Social Sciences (Health Sciences), Tampere University, Tampere, Finland.
Department of Mathematics and Statistics, University of Turku, Turku, Finland.
J Med Internet Res. 2023 Dec 15;25:e44599. doi: 10.2196/44599.
BACKGROUND: Loyalty card data automatically collected by retailers provide an excellent source for evaluating health-related purchase behavior of customers. The data comprise information on every grocery purchase, including expenditures on product groups and the time of purchase for each customer. Because each customer has an expenditure value for every product group at every time point, such data can be formulated as 3D tensorial data.
OBJECTIVE: This study aimed to use the modern tensorial principal component analysis (PCA) method to uncover the characteristics of health-related purchase patterns from loyalty card data. Another aim was to identify card holders with distinct purchase patterns. We also considered the interpretation, advantages, and challenges of tensorial PCA compared with standard PCA.
METHODS: Loyalty card program members from the largest retailer in Finland were invited to participate in this study. Our LoCard data consist of the purchases of 7251 card holders who consented to the use of their data from the year 2016. The purchases were reclassified into 55 product groups and aggregated weekly over 52 weeks. The data were then analyzed using tensorial PCA, which allowed us to reduce the time and product group dimensions simultaneously. The augmentation method was used to select a suitable number of principal components for the analysis.
RESULTS: Using tensorial PCA, we were able to systematically search for typical food purchasing patterns across time and product groups as well as detect different purchasing behaviors across groups of card holders. For example, we identified customers who purchased large amounts of meat products and separated them further into groups based on time profiles, that is, customers whose meat purchases remained stable, increased, or decreased throughout the year, or varied between seasons.
CONCLUSIONS: Using tensorial PCA, we can effectively examine customers' purchasing behavior in more detail than with traditional methods because it can handle time and product group dimensions simultaneously. When interpreting the results, both time and product dimensions must be considered. In further analyses, these time and product groups can be directly associated with additional consumer characteristics such as socioeconomic and demographic predictors of dietary patterns. In addition, they can be linked to external factors that impact grocery purchases such as inflation and unexpected pandemics. This enables us to identify what types of people have specific purchasing patterns, which can help in the development of ways in which consumers can be steered toward making healthier food choices.
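The simultaneous reduction of the product group and time dimensions described in METHODS can be illustrated with a minimal sketch. It assumes a common formulation of tensorial PCA in which a separate covariance matrix is estimated for each mode and each customer's matrix is projected onto the leading eigenvectors of both; the abstract does not spell out the authors' exact estimator, so this should be read as an illustration rather than their implementation. The function name `tensorial_pca`, the synthetic data, and the small dimensions (200 x 6 x 12 instead of 7251 x 55 x 52) are all hypothetical, and the numbers of components are fixed by hand here rather than chosen with the augmentation method.

```python
import numpy as np


def tensorial_pca(X, d_products, d_weeks):
    """Mode-wise PCA sketch for X of shape (customers, products, weeks).

    Assumption: each mode gets its own covariance matrix, and every
    customer's (products x weeks) matrix is projected on both sides.
    """
    n, p, t = X.shape
    # Center each (product, week) cell over customers.
    Xc = X - X.mean(axis=0, keepdims=True)

    # Product-mode covariance (p x p): average of X_i X_i^T over customers and weeks.
    cov_products = np.einsum("ipt,iqt->pq", Xc, Xc) / (n * t)
    # Week-mode covariance (t x t): average of X_i^T X_i over customers and products.
    cov_weeks = np.einsum("ipt,ips->ts", Xc, Xc) / (n * p)

    def top_eigvecs(C, d):
        # eigh returns eigenvalues in ascending order; keep the d largest.
        vals, vecs = np.linalg.eigh(C)
        order = np.argsort(vals)[::-1][:d]
        return vecs[:, order]

    U_prod = top_eigvecs(cov_products, d_products)  # product-group loadings
    U_week = top_eigvecs(cov_weeks, d_weeks)        # time-profile loadings

    # Score matrix per customer: U_prod^T X_i U_week, shape (d_products, d_weeks).
    scores = np.einsum("pa,ipt,tb->iab", U_prod, Xc, U_week)
    return scores, U_prod, U_week


# Synthetic expenditures (hypothetical data, not the LoCard data).
rng = np.random.default_rng(0)
X = rng.gamma(shape=2.0, scale=5.0, size=(200, 6, 12))
scores, U_prod, U_week = tensorial_pca(X, d_products=2, d_weeks=3)
print(scores.shape, U_prod.shape, U_week.shape)
```

In this sketch each customer is summarized by a small score matrix whose rows correspond to product-group components and whose columns correspond to time components, which is why both sets of loadings have to be read together when interpreting a score.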