Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Moscow, Russian Federation, 121205.
Digital Agriculture Laboratory, Skolkovo Institute of Science and Technology, Moscow, Russian Federation, 121205.
Sci Rep. 2021 Dec 10;11(1):23822. doi: 10.1038/s41598-021-02564-w.
Natural environments are complex, heterogeneous structures, so a comprehensive description requires numerous multi-scale observations. Fast and precise instruments are urgently needed to monitor their current state and identify negative impacts of human activity. This work provides an automated approach to assessing the spatial variability of water quality against guideline values, demonstrated on 1526 water samples comprising 21 parameters at 448 unique locations across the New Moscow region (Russia). We apply multi-task Gaussian process regression (GPR) to model the measured water properties across the territory, accounting for not only spatial but also inter-parameter correlations. GPR is enhanced with a Spectral Mixture Kernel to facilitate hyper-parameter selection and optimization. We use a 5-fold cross-validation scheme along with the [Formula: see text]-score to validate the results and select the best model for simultaneous prediction of water properties across the area. Finally, we develop a novel Probabilistic Substance Quality Index (PSQI) that combines probabilistic model predictions with regulatory standards, using the epidemiological rules and hygienic regulations established in Russia as an example. Moreover, we provide an interactive map of experimental results at 100 m resolution. The proposed approach contributes significantly to the development of flexible tools for environmental quality monitoring, and it scales to different standard systems, numbers of observation points, and regions of interest. It has strong potential for adaptation to environmental and policy changes and non-unified assessment conditions, and may be integrated into decision-support systems for the rapid estimation of the spatial distribution of water quality.
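The abstract does not give the PSQI formula, so the following Python sketch is not the authors' index. It illustrates only the general idea of comparing a probabilistic prediction with a regulatory limit: assuming a Gaussian predictive distribution for a parameter at one map cell, it computes the probability that the value stays at or below an upper guideline limit. The function name `prob_within_limit`, the parameter names, and all numbers are hypothetical.

```python
import math


def prob_within_limit(mean, std, limit):
    """Return P(X <= limit) for X ~ Normal(mean, std**2).

    Assumption: the model's prediction at a location is summarized by a
    Gaussian mean and standard deviation, and the guideline is an upper limit.
    """
    if std <= 0:
        # Degenerate case: a point prediction with no uncertainty.
        return 1.0 if mean <= limit else 0.0
    z = (limit - mean) / (std * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# Hypothetical predictions at one map cell:
# parameter name -> (predictive mean, predictive std, guideline upper limit)
predictions = {
    "param_a": (0.20, 0.05, 0.30),  # mean lies two standard deviations below the limit
    "param_b": (45.0, 10.0, 45.0),  # mean lies exactly on the limit
}

# Per-parameter probability of complying with the guideline value.
compliance = {
    name: prob_within_limit(mean, std, limit)
    for name, (mean, std, limit) in predictions.items()
}

for name, p in compliance.items():
    print(f"{name}: P(within limit) = {p:.3f}")
```

A prediction whose mean sits exactly on the limit yields a compliance probability of 0.5 regardless of its uncertainty, which shows why a probabilistic comparison carries more information than thresholding the mean alone.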