The National Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing 210098, China; College of Hydrology and Water Resources, Hohai University, Nanjing 210098, China.
J Environ Manage. 2024 Jun;362:121299. doi: 10.1016/j.jenvman.2024.121299. Epub 2024 Jun 3.
Hydrological forecasting is essential for water resources management and planning, especially given the increasing occurrence of extreme events such as floods and droughts. Physics-informed machine learning (PIML) models effectively integrate conceptual hydrologic models with machine learning (ML) models. In these models, intermediate variables serve as bridges between inputs and outputs, yet their impact on PIML model performance remains unclear. To fill this knowledge gap, this study constructs PIML models based on various hydrologic models, compares different intermediate variables in a case study of 205 CAMELS basins, and explores the relationship between PIML model performance and catchment characteristics. The optimal ML model for constructing PIML is first selected from four ML models across the 205 basins. The PIML models are then developed based on five monthly water balance models, namely TM, XM, MEP, SLM, and TVGM. To quantify the potential impact of differences in intermediate variables, two sets of experiments are designed and performed: S1, with actual evapotranspiration as the intermediate variable, and S2, with soil moisture as the intermediate variable. Results show that the five PIML models generally outperform the optimal standalone ML model, i.e., the Lasso model. Specifically, regardless of the choice of intermediate variable, the PIML-XM model consistently outperforms the other models within the same basins. Almost all constructed PIML models are affected by the choice of intermediate variable in monthly runoff simulations. Typically, S1 performs better than S2. Aridity index, forest fraction, and catchment area have a greater impact on model performance in S2. These findings improve our understanding of constructing PIML models in hydrology by emphasizing their excellent performance in runoff simulations and highlighting the importance of intermediate variables.
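The coupling described in the abstract, where a water balance model supplies an intermediate variable that an ML model then uses alongside the forcing data, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the authors' implementation: `toy_water_balance` is a hypothetical single-bucket model (it is not TM, XM, MEP, SLM, or TVGM), the forcing and "observed" runoff are synthetic, `fit_lasso` is a plain coordinate-descent Lasso written out so the example has no dependencies, and the Nash-Sutcliffe efficiency is used as a common skill score because the abstract does not name its metric. Scores are computed in-sample on synthetic data, so they say nothing about the study's results.

```python
import random


def toy_water_balance(precip, pet, capacity=150.0):
    """Hypothetical single-bucket monthly model; returns (aet, soil_moisture, runoff)."""
    storage = capacity / 2.0
    aet_series, sm_series, runoff_series = [], [], []
    for p, e in zip(precip, pet):
        # Evapotranspiration scales with relative wetness, limited by available water.
        aet = min(e * storage / capacity, storage + p)
        storage += p - aet
        # Water above the bucket capacity leaves as runoff.
        runoff = max(storage - capacity, 0.0)
        storage -= runoff
        aet_series.append(aet)
        sm_series.append(storage)
        runoff_series.append(runoff)
    return aet_series, sm_series, runoff_series


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(rows, targets, alpha=0.1, n_iter=100):
    """Coordinate descent for (1/2n)*||y - Xw - b||^2 + alpha*||w||_1."""
    n, d = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(d)]
    y_mean = sum(targets) / n
    # Centering the data lets the intercept be recovered after fitting.
    X = [[r[j] - col_means[j] for j in range(d)] for r in rows]
    y = [t - y_mean for t in targets]
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho, z = 0.0, 0.0
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    intercept = y_mean - sum(w[j] * col_means[j] for j in range(d))
    return w, intercept


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency; 1.0 means a perfect match."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def run_experiments(seed=0, n_months=120):
    """Compare ML-only, S1 (AET) and S2 (soil moisture) feature sets on synthetic data."""
    rng = random.Random(seed)
    precip = [rng.uniform(0.0, 200.0) for _ in range(n_months)]
    pet = [rng.uniform(20.0, 120.0) for _ in range(n_months)]
    # Synthetic stand-in for observations: the toy model with another capacity plus noise.
    _, _, true_q = toy_water_balance(precip, pet, capacity=120.0)
    observed = [max(q + rng.gauss(0.0, 5.0), 0.0) for q in true_q]
    aet, sm, _ = toy_water_balance(precip, pet)
    feature_sets = {
        "ML only": [[p, e] for p, e in zip(precip, pet)],
        "S1 (AET)": [[p, e, a] for p, e, a in zip(precip, pet, aet)],
        "S2 (soil moisture)": [[p, e, s] for p, e, s in zip(precip, pet, sm)],
    }
    scores = {}
    for name, rows in feature_sets.items():
        weights, intercept = fit_lasso(rows, observed)
        simulated = [intercept + sum(w * x for w, x in zip(weights, r)) for r in rows]
        scores[name] = nse(observed, simulated)
    return scores


if __name__ == "__main__":
    for name, score in run_experiments().items():
        print(f"{name}: NSE = {score:.3f}")
```

Only the extra feature column differs between the S1 and S2 runs, which mirrors the experimental contrast in the abstract. A real application would replace the toy bucket with one of the five water balance models, use observed CAMELS forcing and runoff, and evaluate on held-out months.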