Evaluation of Three Feature Dimension Reduction Techniques for Machine Learning-Based Crop Yield Prediction Models.

Affiliations

School of Earth and Planetary Sciences, Spatial Sciences Discipline, Curtin University, Perth 6102, Australia.

Faculty of Surveying, Mapping and Geographic Information, Hanoi University of Natural Resources and Environment, Hanoi 100000, Vietnam.

Publication information

Sensors (Basel). 2022 Sep 1;22(17):6609. doi: 10.3390/s22176609.

DOI:10.3390/s22176609
PMID:36081066
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9460661/
Abstract

Machine learning (ML) has been widely used around the world to develop crop yield forecasting models. However, identifying the most critical features in a dataset remains challenging. Although either feature selection (FS) or feature extraction (FX) techniques have been employed, no research has compared their performance or, more importantly, examined the benefits of combining the two. This paper therefore proposes a framework that uses no feature reduction (All-F) as a baseline to investigate the performance of FS, FX, and a combination of both (FSX). The case study uses the vegetation condition index (VCI) and temperature condition index (TCI) to develop 21 rice yield forecasting models for eight sub-regions in Vietnam, based on five ML methods: linear, support vector machine (SVM), decision tree (Tree), artificial neural network (ANN), and Ensemble. The results show that FSX takes full advantage of both FS and FX: FSX-based models perform best in 18 of the 21 cases, compared with 2 for FS-based and 1 for FX-based models. These FSX-, FS-, and FX-based models improve on the All-F-based models by 21% on average, and by up to 60%, in terms of RMSE. Furthermore, the 21 best models are based on Ensemble (13 models), Tree (6 models), linear (1 model), and ANN (1 model). These findings highlight the significant role of FS, FX, and especially FSX, coupled with a wide range of ML algorithms (especially Ensemble), in improving the accuracy of crop yield prediction.
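
The abstract does not say which FS or FX algorithms the authors used. As a rough illustration of how the three strategies differ, the sketch below assumes a simple correlation filter for FS and a projection onto the first principal component (found by power iteration) for FX, with FSX applying FS first and FX second. The synthetic data and all function names are hypothetical and are not taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_features(X, y, k):
    """FS (assumed filter method): keep the k columns most correlated with y."""
    n_cols = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def first_component(X, iters=200):
    """FX (assumed PCA): score of each row on the first principal component."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]
    # Sample covariance matrix of the centred columns (d x d).
    C = [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(d)]
         for i in range(d)]
    # Power iteration converges to the leading eigenvector of C.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0:
            break
        v = [a / norm for a in w]
    return [sum(r[j] * v[j] for j in range(d)) for r in Xc]


def fsx(X, y, k):
    """FSX: select k columns first, then extract one component from them."""
    cols = select_features(X, y, k)
    X_sel = [[row[j] for j in cols] for row in X]
    return cols, first_component(X_sel)


# Synthetic stand-in for index features: columns 0 and 1 carry the signal,
# columns 2 and 3 are pure noise. None of this is data from the study.
rng = random.Random(0)
signal = [rng.gauss(0, 1) for _ in range(200)]
X = [[s + rng.gauss(0, 0.1), s + rng.gauss(0, 0.1),
      rng.gauss(0, 1), rng.gauss(0, 1)] for s in signal]
y = [2 * s + rng.gauss(0, 0.1) for s in signal]

cols, component = fsx(X, y, k=2)
```

Here `cols` holds the indices kept by the selection step and `component` is the single extracted feature that a downstream regressor would be trained on. Fitting the same regressor on all columns of `X` would correspond to the All-F baseline.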

Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4dd7/9460661/318b0bb66a40/sensors-22-06609-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4dd7/9460661/969f31954e96/sensors-22-06609-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4dd7/9460661/202b558c4ee7/sensors-22-06609-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4dd7/9460661/d91cb878af53/sensors-22-06609-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4dd7/9460661/161643f332b2/sensors-22-06609-g005.jpg

Similar articles

1
Evaluation of Three Feature Dimension Reduction Techniques for Machine Learning-Based Crop Yield Prediction Models.
Sensors (Basel). 2022 Sep 1;22(17):6609. doi: 10.3390/s22176609.
2
Enhancing Crop Yield Prediction Utilizing Machine Learning on Satellite-Based Vegetation Health Indices.
Sensors (Basel). 2022 Jan 18;22(3):719. doi: 10.3390/s22030719.
3
Prediction of mustard yield using different machine learning techniques: a case study of Rajasthan, India.
Int J Biometeorol. 2023 Mar;67(3):539-551. doi: 10.1007/s00484-023-02434-2. Epub 2023 Jan 31.
4
Machine learning ensembles, neural network, hybrid and sparse regression approaches for weather based rainfed cotton yield forecast.
Int J Biometeorol. 2024 Jun;68(6):1179-1197. doi: 10.1007/s00484-024-02661-1. Epub 2024 Apr 27.
5
Enhancing the security of patients' portals and websites by detecting malicious web crawlers using machine learning techniques.
Int J Med Inform. 2019 Dec;132:103976. doi: 10.1016/j.ijmedinf.2019.103976. Epub 2019 Sep 25.
6
Identification of clinical factors related to prediction of alcohol use disorder from electronic health records using feature selection methods.
BMC Med Inform Decis Mak. 2022 Nov 23;22(1):304. doi: 10.1186/s12911-022-02051-w.
7
KFPredict: An ensemble learning prediction framework for diabetes based on fusion of key features.
Comput Methods Programs Biomed. 2023 Apr;231:107378. doi: 10.1016/j.cmpb.2023.107378. Epub 2023 Jan 26.
8
Do we need different machine learning algorithms for QSAR modeling? A comprehensive assessment of 16 machine learning algorithms on 14 QSAR data sets.
Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa321.
9
Machine-learning models for activity class prediction: A comparative study of feature selection and classification algorithms.
Gait Posture. 2021 Sep;89:45-53. doi: 10.1016/j.gaitpost.2021.06.017. Epub 2021 Jun 24.
10
A GA-stacking ensemble approach for forecasting energy consumption in a smart household: A comparative study of ensemble methods.
J Environ Manage. 2024 Jul;364:121264. doi: 10.1016/j.jenvman.2024.121264. Epub 2024 Jun 12.

Cited by

1
Crop yield prediction in agriculture: A comprehensive review of machine learning and deep learning approaches, with insights for future research and sustainability.
Heliyon. 2024 Nov 29;10(24):e40836. doi: 10.1016/j.heliyon.2024.e40836. eCollection 2024 Dec 30.
