Khaki Saeed, Wang Lizhi, Archontoulis Sotirios V
Industrial and Manufacturing Systems Engineering Department, Iowa State University, Ames, IA, United States.
Department of Agronomy, Iowa State University, Ames, IA, United States.
Front Plant Sci. 2020 Jan 24;10:1750. doi: 10.3389/fpls.2019.01750. eCollection 2019.
Crop yield prediction is extremely challenging because yield depends on multiple factors, including crop genotype, environmental factors, management practices, and their interactions. This paper presents a deep learning framework using convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for crop yield prediction based on environmental data and management practices. The proposed CNN-RNN model, along with other popular methods such as random forest (RF), deep fully connected neural networks (DFNN), and LASSO, was used to forecast corn and soybean yield across the entire Corn Belt (including 13 states) in the United States for the years 2016, 2017, and 2018 using historical data. The new model achieved a root-mean-square error (RMSE) of 9% and 8% of the respective average yields for corn and soybean, substantially outperforming all other methods that were tested. The CNN-RNN model has three salient features that make it a potentially useful method for other crop yield prediction studies. (1) It was designed to capture the time dependencies of environmental factors and the genetic improvement of seeds over time without requiring genotype information. (2) It demonstrated the capability to generalize yield prediction to untested environments without a significant drop in prediction accuracy. (3) Coupled with the backpropagation method, it could reveal the extent to which weather conditions, accuracy of weather predictions, soil conditions, and management practices were able to explain the variation in crop yields.
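The headline metric above, RMSE expressed as a fraction of average yield, can be illustrated with a minimal sketch. It assumes the normalization is by the mean of the observed yields, which the abstract does not spell out; the function name `relative_rmse` and the sample numbers are hypothetical and are not data from the study.

```python
import math


def relative_rmse(observed, predicted):
    """Return RMSE divided by the mean observed value.

    Assumption: "RMSE as a percentage of average yield" means
    RMSE / mean(observed). This is an illustrative helper, not
    code from the paper.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of squared prediction errors.
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    rmse = math.sqrt(mse)
    # Normalize by the average observed yield.
    mean_observed = sum(observed) / len(observed)
    return rmse / mean_observed


if __name__ == "__main__":
    # Made-up yields (arbitrary units), purely for illustration.
    observed = [100.0, 200.0]
    predicted = [110.0, 190.0]
    print(f"relative RMSE: {relative_rmse(observed, predicted):.1%}")
```

Multiplying the returned fraction by 100 gives the percentage form used in the abstract, so a value of 0.09 would correspond to an RMSE of 9% of the average yield.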