
[Machine Learning-based Dissolved Oxygen Prediction Modeling and Evaluation in the Yangtze River Estuary].

Authors

Li Xiao-Ying, Wang Hua, Wang Yi-Qing, Zhang Liang-Jing, Wu Yi

Affiliations

College of Environment, Hohai University, Nanjing 210098, China.

Key Laboratory of Integrated Regulation and Resource Development on Shallow Lake of Ministry of Education, Hohai University, Nanjing 210098, China.

Publication information

Huan Jing Ke Xue. 2024 Dec 8;45(12):7123-7133. doi: 10.13227/j.hjkx.202312111.

Abstract

Dissolved oxygen (DO) is a key indicator of the self-purification capacity of aquatic ecosystems and of overall water quality. In the Yangtze Estuary, an important center of biodiversity and economic activity in China, understanding and forecasting DO levels supports effective environmental management. The estuary's dynamic and complex hydrological and ecological systems make water quality monitoring demanding, which motivates applying machine learning to the monitoring and prediction of DO. Six monitoring stations across the estuary were selected: Xuliujing, Nantong Port, Qidong Port, Qinglong Port, South Port, and North Port. Three modeling techniques, particle swarm optimization-support vector regression (PSO-SVR), artificial neural network (ANN), and random forest (RF), were used to analyze and forecast DO levels from monthly average water quality data spanning 2004 to 2020. Each model brings distinct analytical strengths, from the non-linear pattern recognition of ANN to the robustness and interpretability of RF. Evaluation with the RF model identified three water quality variables, namely temperature, five-day biochemical oxygen demand, and ammonia nitrogen, as the most important influences on the spatial-temporal dynamics of DO in the estuary. Comparison of the predictions from the PSO-SVR, ANN, and RF models showed that the RF model performed best across the six monitoring stations, with an overall average error of 0.19, compared with 0.38 for PSO-SVR and 0.47 for ANN. The overall prediction performance ranking on the training set was RF (R² = 0.971; RMSE = 0.341 mg·L⁻¹) > PSO-SVR (R² = 0.884; RMSE = 0.707 mg·L⁻¹) > ANN (R² = 0.792; RMSE = 0.967 mg·L⁻¹). The ranking on the test set was RF (R² = 0.986; RMSE = 0.165 mg·L⁻¹) > PSO-SVR (R² = 0.951; RMSE = 0.332 mg·L⁻¹) > ANN (R² = 0.800; RMSE = 0.633 mg·L⁻¹). The RF model therefore showed the best predictive ability on all monitoring sections, with excellent performance and generalization on both the training and test sets. The PSO-SVR model also performed well on most monitoring sections, with slightly lower predictive performance than the RF model but with better stability and generalization ability. The ANN model performed less well than the other two models on some monitoring sections, and its network structure or parameters may need further optimization to improve prediction accuracy and stability.
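
The abstract ranks the models by a goodness-of-fit score and by RMSE but does not give the formulas. The following is a minimal sketch of how such metrics are commonly computed, assuming the score is the coefficient of determination (R²). The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (e.g. mg/L)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly DO values in mg/L, invented for illustration only.
observed = [8.0, 9.0, 10.0, 7.0]
predicted = [8.5, 8.5, 10.0, 7.0]

print(f"RMSE = {rmse(observed, predicted):.3f} mg/L")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

A lower RMSE and an R² closer to 1 both indicate a closer fit, which is the sense in which the abstract orders RF > PSO-SVR > ANN.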
