Beijing Laboratory for Intelligent Environmental Protection, School of Artificial Intelligence, Beijing Technology and Business University, Beijing, China.
Beijing Institute of Fashion Technology, Beijing, China.
PLoS One. 2023 Nov 14;18(11):e0294278. doi: 10.1371/journal.pone.0294278. eCollection 2023.
Traditional single deep prediction models adapt poorly to changes in time series data when predicting lake eutrophication. To address this problem, this study takes the multi-factor water quality data that affect lake eutrophication as its main research object. A deep reinforcement learning model is proposed that can switch between water quality prediction models at different times and, through its own continuous learning, select the optimal prediction strategy for lake eutrophication at the current time. The reinforcement learning algorithm itself is also improved. First, the greedy factor, normally a fixed parameter in agent training, is reformulated using an arctangent function, and a mean-value reward factor is defined. On this basis, three Q estimates are introduced; the weight parameters are obtained by computing the actual Q value, and the average and the minimum of the estimates are used to update the final Q table, yielding an Improved MIMO-DD-3Q Learning model. This model produces preliminary eutrophication predictions, and the resulting errors are fed back as a secondary input to continue updating the Q table, which gives the final Improved MIMO-DD-3Q Learning model and the final prediction of water eutrophication. Multi-factor water quality data from the Yongding River in Beijing, covering 0:00 on July 26, 2021 to 0:00 on September 5, 2021, were used. Data smoothing and principal component analysis were first applied, confirming that the factors involved in the occurrence of lake eutrophication are correlated. The Improved MIMO-DD-3Q Learning prediction model was then verified experimentally. The results show that the Improved MIMO-DD-3Q Learning model performs well in lake eutrophication prediction.
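The two algorithmic ideas in the abstract, an arctangent-shaped greedy factor and a Q-table update that blends the average and the minimum of three Q estimates, can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the function names, the shape and constants of the arctangent schedule (`eps_max`, `eps_min`, `sharpness`), the fixed blending weight `beta` (the paper derives its weights rather than fixing them), and the choice to update one randomly selected table per step. The mean-value reward factor and the error feedback stage are not modelled.

```python
import math
import random


def arctan_epsilon(step, total_steps, eps_max=1.0, eps_min=0.05, sharpness=10.0):
    """Greedy factor that decays smoothly along an arctangent curve.

    Assumed form: the paper's exact expression is not given in the abstract.
    """
    progress = step / max(total_steps, 1)
    # atan(x) / pi lies in (-0.5, 0.5), so frac lies in (0, 1) and decreases.
    frac = 0.5 - math.atan(sharpness * (progress - 0.5)) / math.pi
    return eps_min + (eps_max - eps_min) * frac


def combined_target(q_tables, next_state, reward, gamma=0.9, beta=0.5):
    """Target value that blends the mean and the minimum of the Q estimates.

    beta is a hypothetical fixed weight standing in for the learned weights.
    """
    estimates = [max(q[next_state]) for q in q_tables]
    mean_est = sum(estimates) / len(estimates)
    min_est = min(estimates)
    return reward + gamma * (beta * mean_est + (1.0 - beta) * min_est)


def update(q_tables, state, action, reward, next_state,
           alpha=0.1, gamma=0.9, beta=0.5, rng=random):
    """Move one randomly chosen Q table toward the combined target."""
    target = combined_target(q_tables, next_state, reward, gamma, beta)
    k = rng.randrange(len(q_tables))
    q_tables[k][state][action] += alpha * (target - q_tables[k][state][action])
    return k


def new_tables(n_tables=3, n_states=2, n_actions=2):
    """Zero-initialised Q tables stored as nested lists: table, state, action."""
    return [[[0.0] * n_actions for _ in range(n_states)] for _ in range(n_tables)]
```

In this sketch an action would correspond to choosing which prediction model to use at the current time step. Taking the minimum of the estimates counteracts the overestimation typical of a single Q table, while the average keeps the target from becoming overly pessimistic; blending the two is one plausible reading of the update described above.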