
Modeling household online shopping demand in the U.S.: a machine learning approach and comparative investigation between 2009 and 2017.

Authors

Barua Limon, Zou Bo, Zhou Yan, Liu Yulin

Affiliations

Department of Civil, Materials, and Environmental Engineering, University of Illinois Chicago, Chicago, USA.

Department of Civil and Environmental Engineering, University of California, Berkeley, USA.

Publication information

Transportation (Amst). 2023;50(2):437-476. doi: 10.1007/s11116-021-10250-z. Epub 2021 Dec 2.

DOI:10.1007/s11116-021-10250-z
PMID:34873350
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8637526/
Abstract

Despite the rapid growth of online shopping and research interest in the relationship between online and in-store shopping, national-level modeling and investigation of the demand for online shopping with a prediction focus remain limited in the literature. This paper differs from prior work and leverages two recent releases of the U.S. National Household Travel Survey (NHTS) data for 2009 and 2017 to develop machine learning (ML) models, specifically gradient boosting machine (GBM), for predicting household-level online shopping purchases. The NHTS data allow for not only conducting nationwide investigation but also at the level of households, which is more appropriate than at the individual level given the connected consumption and shopping needs of members in a household. We follow a systematic procedure for model development including employing Recursive Feature Elimination algorithm to select input variables (features) in order to reduce the risk of model overfitting and increase model explainability. Among several ML models, GBM is found to yield the best prediction accuracy. Extensive post-modeling investigation is conducted in a comparative manner between 2009 and 2017, including quantifying the importance of each input variable in predicting online shopping demand, and characterizing value-dependent relationships between demand and the input variables. In doing so, two latest advances in machine learning techniques, namely Shapley value-based feature importance and Accumulated Local Effects plots, are adopted to overcome inherent drawbacks of the popular techniques in current ML modeling. The modeling and investigation are performed at the national level, with a number of findings obtained. The models developed and insights gained can be used for online shopping-related freight demand generation and may also be considered for evaluating the potential impact of relevant policies on online shopping demand.
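
The abstract names Shapley value-based feature importance as one of the interpretation techniques. As a rough illustration of the underlying idea only, the sketch below computes exact Shapley attributions for a single prediction by averaging each feature's marginal contribution over all feature orderings. It is not the authors' implementation: the toy model, its coefficients, the feature names (income, household_size, urban), and the baseline values are all hypothetical stand-ins, and no NHTS data or GBM is involved.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Features are switched from their baseline value to their observed
    value one at a time, in every possible order; each feature's
    attribution is its average marginal contribution over all orders.
    Cost grows factorially, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        prev = predict(current)
        for j in order:
            current[j] = x[j]          # reveal feature j
            new = predict(current)
            phi[j] += new - prev       # marginal contribution of j
            prev = new
    n_orders = factorial(n)
    return [p / n_orders for p in phi]


# Hypothetical toy "demand model" (assumption, not from the paper):
# a linear part plus an income-by-urban interaction.
def toy_model(features):
    income, household_size, urban = features
    return 2.0 * income + 1.0 * household_size + 0.5 * income * urban


x = [3.0, 4.0, 1.0]          # hypothetical household to explain
baseline = [1.0, 2.0, 0.0]   # hypothetical reference household

phi = shapley_values(toy_model, x, baseline)
for name, value in zip(["income", "household_size", "urban"], phi):
    print(f"{name}: {value:.2f}")
```

The attributions sum to the gap between the prediction for `x` and the prediction for the baseline, and the interaction term is split between the two features involved. A global importance score is commonly obtained by averaging the absolute attributions over many observations; with a realistic number of features, approximation methods are needed instead of full enumeration.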

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74cb/8637526/f1d154b65c3e/11116_2021_10250_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74cb/8637526/d040fbd7de3d/11116_2021_10250_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74cb/8637526/5aabf5fd0c9d/11116_2021_10250_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74cb/8637526/e1c9f05822f9/11116_2021_10250_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74cb/8637526/3789857420f3/11116_2021_10250_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74cb/8637526/8edb2a9c98fd/11116_2021_10250_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74cb/8637526/bbd5b5833fd1/11116_2021_10250_Fig7_HTML.jpg
