Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia.
School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia.
F1000Res. 2024 Mar 28;12:991. doi: 10.12688/f1000research.136097.2. eCollection 2023.
Water is the lifeblood of river networks, and its quality plays a crucial role in sustaining both aquatic ecosystems and human societies. Real-time monitoring of water quality is increasingly reliant on in-situ sensor technology. Anomaly detection is crucial for identifying erroneous patterns in sensor data, but it can be challenging because the data are complex and variable even under typical conditions. This paper presents an approach to anomaly detection for river network sensor data, which is essential for accurate and continuous monitoring.
We use a graph neural network model, the recently proposed Graph Deviation Network (GDN), which employs graph attention-based forecasting to capture the complex spatio-temporal relationships between sensors. We propose an alternative anomaly threshold criterion for the model, termed GDN+, based on the learned graph. To evaluate the model's efficacy, we introduce new benchmarking simulation experiments with highly sophisticated dependency structures and subsequence anomalies of various types. We also introduce software called gnnad.
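As a rough illustration of how a forecasting-based detector of this kind turns prediction errors into anomaly flags, the following minimal sketch follows the scoring commonly described for the baseline GDN: per-sensor absolute forecast errors are robustly normalised by their median and interquartile range, the maximum over sensors gives a per-time-step score, and the threshold is the largest score seen on validation data. The synthetic data, the helper names `fit_normaliser` and `sensor_scores`, and the choice to estimate the normalisation from validation errors are assumptions made for illustration only. The graph-based GDN+ criterion is not reproduced here, and this is not the gnnad API.

```python
import numpy as np

def fit_normaliser(y_true, y_pred, eps=1e-8):
    """Per-sensor median and IQR of absolute forecast errors (hypothetical helper)."""
    err = np.abs(y_true - y_pred)
    median = np.median(err, axis=0)
    q25, q75 = np.percentile(err, [25, 75], axis=0)
    # Guard against a zero IQR for near-constant sensors.
    return median, np.maximum(q75 - q25, eps)

def sensor_scores(y_true, y_pred, median, iqr):
    """Robustly normalised absolute errors, shape (time, sensors)."""
    return (np.abs(y_true - y_pred) - median) / iqr

# Synthetic stand-in for observations and model forecasts (assumption:
# a real workflow would use forecasts from a trained model).
rng = np.random.default_rng(0)
n_val, n_test, n_sensors = 200, 100, 5
y_val = rng.normal(size=(n_val, n_sensors))
yhat_val = y_val + rng.normal(scale=0.1, size=(n_val, n_sensors))
y_test = rng.normal(size=(n_test, n_sensors))
yhat_test = y_test + rng.normal(scale=0.1, size=(n_test, n_sensors))
y_test[50, 2] += 10.0  # inject a spike anomaly on sensor 2 at time step 50

# Assumption: normalisation statistics are estimated on validation errors.
median, iqr = fit_normaliser(y_val, yhat_val)
val_score = sensor_scores(y_val, yhat_val, median, iqr).max(axis=1)
test_score = sensor_scores(y_test, yhat_test, median, iqr).max(axis=1)

# Baseline-style threshold: the maximum validation score.
threshold = val_score.max()
flags = test_score > threshold
```

Because a single global threshold is applied to the maximum over all sensors, one noisy sensor can dominate the score, which is one reason a threshold informed by the learned graph may be attractive in high-dimensional settings.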
We further examine the strengths and weaknesses of the baseline approach, GDN, in comparison with other benchmark methods on complex real-world river network data.
Findings suggest that GDN+ outperforms the baseline approach in high-dimensional data, while also providing improved interpretability.