Wang Xuan, Tian Wenchong, Liao Zhenliang
College of Environmental Science and Engineering, Tongji University, Shanghai, 200092, China.
College of Civil Engineering and Architecture, Xinjiang University, Urumqi, 830046, China.
Environ Sci Pollut Res Int. 2021 Feb 27. doi: 10.1007/s11356-021-13086-3.
Performance comparisons between the autoregressive integrated moving average (ARIMA) model and the artificial neural network (ANN) have mostly been carried out between model structures selected through trial and error, so their conclusions are strongly influenced by model structure uncertainty. This research aims to address that shortcoming. First, a surface water quality prediction case study covering eight monitoring sites in China is introduced. Second, the performance of ARIMA and ANN is compared statistically across 6912 seasonal ARIMA (SARIMA) models and 110,592 feedforward ANN models with different model structures, based on the mean square error (MSE) distributions depicted by boxplots. Statistically, the ANN models obtained a significantly lower median and a more concentrated distribution of validation MSEs, which indicates less overfitting and better generalization ability. Furthermore, the performance of the optimal SARIMA models was inferior even to the median of the ANN models in the case study. Compared with previous comparisons among selected models, the statistical comparison in this study is subject to lower uncertainty.
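The distribution-level comparison described in the abstract can be illustrated with a minimal sketch. It assumes that each model family has already produced one validation MSE per candidate structure. The function names `summarize` and `compare` are hypothetical, and the MSE values are invented for illustration only; they are not results from the study.

```python
import statistics


def summarize(mses):
    """Return boxplot-style statistics for a list of validation MSEs."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(mses, n=4)
    return {"best": min(mses), "q1": q1, "median": median,
            "q3": q3, "iqr": q3 - q1}


def compare(sarima_mses, ann_mses):
    """Compare two model families by median, spread, and best-vs-median."""
    s, a = summarize(sarima_mses), summarize(ann_mses)
    return {
        # Lower median MSE across all candidate structures.
        "ann_lower_median": a["median"] < s["median"],
        # Smaller interquartile range, i.e. a more concentrated distribution.
        "ann_tighter_spread": a["iqr"] < s["iqr"],
        # Even the best SARIMA structure is worse than a typical ANN.
        "best_sarima_worse_than_median_ann": s["best"] > a["median"],
    }


# Hypothetical validation MSEs, one per candidate model structure.
sarima = [0.30, 0.35, 0.42, 0.50, 0.61, 0.75, 0.90]
ann = [0.10, 0.12, 0.13, 0.14, 0.15, 0.17, 0.20]
result = compare(sarima, ann)
```

Comparing whole distributions in this way, rather than one hand-picked model from each family, is what reduces the dependence of the conclusion on which structure happened to be selected.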