State Environmental Protection Key Laboratory of Soil Environmental Management and Pollution Control, Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment of China, Nanjing 210042, China.
Int J Environ Res Public Health. 2022 Jul 30;19(15):9374. doi: 10.3390/ijerph19159374.
Chlorinated aliphatic hydrocarbons (CAHs) are widely used in agriculture and industry and have become some of the most common groundwater contaminants. Motivated by the strong predictive performance of deep learning methods, this study used long short-term memory (LSTM) and extreme gradient boosting (XGBoost) models to forecast dichloroethene (DCE) concentrations at a pesticide-contaminated site undergoing natural attenuation. The input variables included BTEX, vinyl chloride (VC), and five water quality indicators. The predictive performance of the two models was compared, and the influence of the input variables on model performance was evaluated. The results indicated that XGBoost was more likely to capture DCE variation and was robust at high values, while LSTM achieved better accuracy for all wells. The well with higher DCE concentrations lowered model accuracy, and this effect was more evident in XGBoost than in LSTM. The SHapley Additive exPlanations (SHAP) values of the variables were highly consistent with the rules of biodegradation in the real environment. Both LSTM and XGBoost could predict DCE concentrations using only water quality variables, and LSTM performed better than XGBoost.
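To illustrate what a SHAP value measures, the following minimal sketch computes exact Shapley attributions for a toy model by enumerating every feature subset. Everything here is an assumption made for illustration: `toy_model`, its coefficients, the three unnamed inputs, and the all-zero baseline are hypothetical and are not the study's models, variables, or data. The SHAP library estimates attributions of this kind far more efficiently for real models; the brute-force version is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `model` at `instance` relative to `baseline`.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only usable for very small n.
    """
    n = len(instance)

    def value(subset):
        # Evaluate the model with coalition features taken from the instance
        # and all remaining features taken from the baseline.
        x = [instance[j] if j in subset else baseline[j] for j in range(n)]
        return model(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(x):
    # Hypothetical response with one interaction term (not from the paper).
    return 3.0 * x[0] - 2.0 * x[1] + x[0] * x[2]


instance = [1.0, 2.0, 3.0]   # hypothetical sample
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the model output at the sample and at the baseline, which is the additivity property that lets per-variable contributions be compared against domain knowledge such as known biodegradation behaviour.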