Kim Yeonuk, Garcia Monica, Andrew Black T, Johnson Mark S
Institute for Resources, Environment and Sustainability, University of British Columbia, Vancouver, Canada.
Division of Hydrologic Sciences, Desert Research Institute, Las Vegas, Nevada, United States of America.
PLoS One. 2025 Jul 23;20(7):e0328798. doi: 10.1371/journal.pone.0328798. eCollection 2025.
Physics-informed machine learning techniques have emerged to tackle challenges inherent in pure machine learning (ML) approaches. One such technique, the hybrid approach, has been introduced to estimate terrestrial evapotranspiration (ET), a crucial variable linking the water, energy, and carbon cycles. A key advantage of hybrid ET models over ET estimates that rely solely on ML is their improved performance, particularly under extreme conditions. However, the mechanisms driving this improvement are not well understood. To address this gap, we developed six hybrid approaches based on different physical formulations of ET and compared them with a pure ML model. All models employed the random forest algorithm and were trained on daily-scale ET observations, in-situ meteorological data, and satellite remote sensing data. We found a strong correlation (r = 0.93) between the sensitivity of ET estimates to machine-learned parameters and model error (root-mean-square error; RMSE), indicating that reduced sensitivity limits error propagation and improves performance. Notably, the most accurate hybrid model (RMSE = 17.8 W m⁻² in energy units) used a novel empirical parameter that is relatively stable due to land-atmosphere equilibrium; it outperformed both the pure ML model and hybrid models requiring conventional parameters (e.g., surface conductance). These results imply that conventional parameterizations may need to be reevaluated to integrate physical models with machine learning effectively, because conventional choices may not be optimal for this new hybrid paradigm. This study underscores the critical role of domain knowledge in setting up hybrid models and may guide future hybrid model development beyond ET estimation.
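The link the abstract draws between parameter sensitivity and error propagation can be illustrated with a minimal sketch. The abstract does not specify the six physical formulations, so the code below assumes the Penman-Monteith equation with surface conductance as the machine-learned parameter, purely as an example of a conventional parameterization. The function names, the constants (FAO-56 style saturation vapour pressure slope, a fixed psychrometric constant, a fixed volumetric heat capacity of air), and the forcing values are all illustrative assumptions, and the surface conductance value stands in for what a trained random forest would predict.

```python
import math


def saturation_vapor_pressure_slope(t_air_c):
    """Slope of the saturation vapour pressure curve (kPa per degC), FAO-56 form."""
    es = 0.6108 * math.exp(17.27 * t_air_c / (t_air_c + 237.3))
    return 4098.0 * es / (t_air_c + 237.3) ** 2


def penman_monteith_le(g_s, available_energy, vpd_kpa, t_air_c, g_a,
                       rho_cp=1.2 * 1013.0, gamma=0.066):
    """Latent heat flux (W m-2) from the Penman-Monteith equation.

    g_s, g_a         : surface and aerodynamic conductance (m s-1)
    available_energy : net radiation minus ground heat flux (W m-2)
    vpd_kpa          : vapour pressure deficit (kPa)
    rho_cp           : assumed volumetric heat capacity of air (J m-3 K-1)
    gamma            : assumed psychrometric constant (kPa K-1)
    """
    delta = saturation_vapor_pressure_slope(t_air_c)
    numerator = delta * available_energy + rho_cp * vpd_kpa * g_a
    denominator = delta + gamma * (1.0 + g_a / g_s)
    return numerator / denominator


def relative_sensitivity(model, param, rel_step=1e-4, **forcing):
    """Dimensionless sensitivity (dLE/dp) * (p/LE) by central finite difference."""
    h = param * rel_step
    base = model(param, **forcing)
    derivative = (model(param + h, **forcing) - model(param - h, **forcing)) / (2.0 * h)
    return derivative * param / base


# Hypothetical midday forcing; g_s_predicted is a placeholder for an ML prediction.
forcing = dict(available_energy=400.0, vpd_kpa=1.5, t_air_c=25.0, g_a=0.02)
g_s_predicted = 0.01

le = penman_monteith_le(g_s_predicted, **forcing)
sens = relative_sensitivity(penman_monteith_le, g_s_predicted, **forcing)

# To first order, a 10 % error in the learned parameter becomes a
# (sens * 10) % error in the ET estimate.
print(f"LE = {le:.1f} W m-2, relative sensitivity to g_s = {sens:.3f}")
```

Under this sketch, a formulation whose learned parameter has a smaller relative sensitivity, or varies less and is therefore easier to predict, would pass less of the ML error through to the ET estimate. This matches the reasoning in the abstract, though it does not reproduce the study's actual models or results.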