A Hierarchical RF-XGBoost Model for Short-Cycle Agricultural Product Sales Forecasting.

Authors

Li Jiawen, Lin Binfan, Wang Peixian, Chen Yanmei, Zeng Xianxian, Liu Xin, Chen Rongjun

Affiliations

School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou 510665, China.

Guangxi Key Lab of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin 541004, China.

Publication information

Foods. 2024 Sep 17;13(18):2936. doi: 10.3390/foods13182936.

Abstract

Short-cycle agricultural product sales forecasting can significantly reduce food waste by accurately predicting demand, helping producers match supply with consumer needs. However, such forecasting is often subject to uncertain factors, resulting in highly volatile and discontinuous data. To address this, this work proposes a hierarchical prediction model that combines Random Forest (RF) and Extreme Gradient Boosting (XGBoost). The first layer uses RF to produce initial predictions and extract residuals, based on correlation features from Grey Relation Analysis (GRA). Hierarchical clustering is then applied to classify the characteristics of the residuals, and a new feature set is generated from the resulting residual clustering features. XGBoost serves as the second layer, using those residual clustering features to yield its prediction results. The final prediction is obtained by combining the results of the first and second layers. For performance evaluation, agricultural product sales data from a supermarket in China covering 1 July 2020 to 30 June 2023 are used. The results demonstrate superiority over standalone RF and XGBoost, with Mean Absolute Percentage Error (MAPE) reductions of 10% and 12%, respectively, and coefficient of determination (R²) increases of 22% and 24%, respectively. Additionally, the model's generalization is validated across 42 types of agricultural products from six vegetable categories, indicating broad practical applicability. These results show that the proposed model improves the precision of short-term agricultural product sales forecasting, which helps optimize the supply chain from producers to consumers and minimize food waste.
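
The abstract does not spell out how GRA ranks candidate features, so the following is only a minimal sketch of the textbook grey relational grade, not the authors' implementation. It assumes min-max scaling of every sequence, a distinguishing coefficient `rho` of 0.5, and equal weighting across time steps; the function names and the tiny data set (`sales`, `price_scaled_copy`, `noise`) are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def min_max(seq):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(v - lo) / (hi - lo) for v in seq]


def grey_relational_grades(target, features, rho=0.5):
    """Return {feature name: grey relational grade} relative to `target`.

    `rho` is the distinguishing coefficient. The value 0.5 is a common
    default and an assumption here, not a setting taken from the paper.
    """
    ref = min_max(target)
    # Absolute differences between the scaled reference and each scaled feature.
    deltas = {
        name: [abs(r - x) for r, x in zip(ref, min_max(values))]
        for name, values in features.items()
    }
    all_deltas = [d for row in deltas.values() for d in row]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every feature coincides with the reference after scaling.
        return {name: 1.0 for name in deltas}
    # Grade = mean grey relational coefficient over all time steps.
    return {
        name: mean((d_min + rho * d_max) / (d + rho * d_max) for d in row)
        for name, row in deltas.items()
    }


# Hypothetical daily sales and two candidate features (illustrative only).
sales = [10, 12, 15, 11, 18]
features = {
    "price_scaled_copy": [20, 24, 30, 22, 36],  # moves exactly with sales
    "noise": [5, 1, 4, 2, 3],                   # unrelated to sales
}
grades = grey_relational_grades(sales, features)
ranked = sorted(grades, key=grades.get, reverse=True)
```

Features with a higher grade track the sales series more closely, so a cutoff on `grades` (or the top entries of `ranked`) could serve as the correlation feature set passed to a first-layer model. How the paper sets that cutoff is not stated in the abstract.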

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/348f/11431005/e5bc51685f4c/foods-13-02936-g001.jpg
