State Key Laboratory of Marine Geology, Tongji University, Shanghai 200092, China.
School of Information Technology and Electrical Engineering, The University of Queensland, St. Lucia, QLD 4072, Australia.
Sensors (Basel). 2018 Aug 10;18(8):2628. doi: 10.3390/s18082628.
With the construction and deployment of seafloor observatories around the world, massive amounts of oceanographic measurement data have been gathered and transmitted to data centers. The growing volume of observed data supports marine scientific research, but it also raises the requirements for data quality control, because scientists must ensure that their research outcomes rest on high-quality data. In this paper, we first analyzed and defined the data quality problems occurring in the East China Sea Seafloor Observatory System (ECSSOS). We then proposed a method to detect and repair these problems. Combining data statistics with expert knowledge from domain specialists, the method consists of three parts: a general pretest that preprocesses the data and routes them to further processing, outlier detection methods that label suspect data points, and an interpolation method that fills in missing and suspect data. The autoregressive integrated moving average (ARIMA) model was adapted to seafloor observatory data quality control by using a sliding window and by cleaning the data used to fit the model. A quality control flag system was also proposed and applied to describe the quality control results and the processing procedure. Real observed data from ECSSOS were used to implement and test the method. The results showed that the method was effective at detecting and repairing data quality problems in seafloor observatory data.
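The three-part structure described in the abstract (pretest, outlier labeling, repair by interpolation, all recorded through quality flags) can be illustrated with a minimal sketch. This is not the authors' implementation. A sliding-window median test stands in for the outlier detectors, and plain linear interpolation stands in for the ARIMA-based interpolation. The flag values, function names, and thresholds are all hypothetical, since the abstract does not specify them.

```python
from statistics import median

# Hypothetical flag values; the abstract does not define the actual flag scheme.
GOOD, SUSPECT, REPAIRED, MISSING = 1, 3, 8, 9


def pretest(values, lo, hi):
    """General pretest: flag missing (None) and physically out-of-range readings."""
    flags = []
    for v in values:
        if v is None:
            flags.append(MISSING)
        elif not (lo <= v <= hi):
            flags.append(SUSPECT)
        else:
            flags.append(GOOD)
    return flags


def flag_outliers(values, flags, window=5, k=3.0, min_scale=0.05):
    """Sliding-window outlier test (a simple stand-in, not ARIMA).

    A GOOD point becomes SUSPECT when it lies more than k robust scale units
    from the median of the preceding window. Only points still flagged GOOD
    enter the window, so earlier outliers do not contaminate later decisions.
    """
    flags = list(flags)
    for i, v in enumerate(values):
        if flags[i] != GOOD:
            continue
        clean = [values[j] for j in range(max(0, i - window), i) if flags[j] == GOOD]
        if len(clean) < 3:
            continue  # too little clean history to judge this point
        med = median(clean)
        mad = median([abs(c - med) for c in clean])
        if abs(v - med) > k * max(mad, min_scale):
            flags[i] = SUSPECT
    return flags


def repair(values, flags):
    """Fill SUSPECT and MISSING points by linear interpolation between GOOD neighbors."""
    values, flags = list(values), list(flags)
    good = [i for i, f in enumerate(flags) if f == GOOD]
    for i, f in enumerate(flags):
        if f == GOOD:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is not None and right is not None:
            w = (i - left) / (right - left)
            values[i] = values[left] + w * (values[right] - values[left])
        elif left is not None:
            values[i] = values[left]   # no later GOOD point: carry the last one forward
        elif right is not None:
            values[i] = values[right]  # no earlier GOOD point: carry the next one back
        else:
            continue                   # nothing to interpolate from; leave as flagged
        flags[i] = REPAIRED
    return values, flags
```

For a short temperature series containing one spike and one gap, calling `pretest`, then `flag_outliers`, then `repair` labels the spike as suspect, the gap as missing, and replaces both with interpolated values flagged as repaired. Keeping the original flag alongside the repaired value, rather than overwriting it as this sketch does, would preserve more of the processing history that the paper's flag system is meant to record.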