An Ensemble Method for Missing Data of Environmental Sensor Considering Univariate and Multivariate Characteristics.

Author information

School of Statistics and Actuarial Science, Soongsil University, Seoul 06978, Korea.

School of Electronic Engineering, Soongsil University, Seoul 06978, Korea.

Publication information

Sensors (Basel). 2021 Nov 16;21(22):7595. doi: 10.3390/s21227595.

Abstract

With rapid urbanization, awareness of environmental pollution is growing and, accordingly, interest in environmental sensors that measure atmospheric and indoor air quality is increasing. Because these IoT-based environmental sensors are sensitive and their reliability matters, it is essential to deal with missing values, which are one cause of reliability problems. Two characteristics can be used to impute missing values in environmental sensor data: the time dependency of each single variable and the correlation among multiple variables. However, existing imputation methods have used only one of these characteristics, and no case has used both. In this work, we introduce a new ensemble imputation method that reflects both. First, the situations in which missing values occur frequently were divided into four cases, and experimental data were generated for each: communication error (aperiodic, periodic) and sensor error (rapid change, measurement range). To compare the existing methods with the proposed method, five univariate imputation methods and five multivariate imputation methods, all widely used, were applied as single models to predict the missing values in the four cases. The values predicted by the single models were then fed to the ensemble method. Among ensemble methods, weighted averaging and stacking were used to derive the final predicted values and replace the missing values. Finally, the predicted values were evaluated against the original data using the mean absolute error (MAE) and the root mean square error (RMSE). The proposed ensemble method generally performed better than the single methods. In addition, it simultaneously considers the correlation between variables and the time dependence, both of which must be considered for environmental sensors. As a result, the proposed ensemble technique can contribute to replacing the missing values generated by environmental sensors, which can help increase the reliability of environmental sensor data.
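The weighted-average idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic two-variable data, the choice of linear interpolation as the univariate model, a least-squares line on a second sensor as the multivariate model, and inverse-MAE weights estimated on a held-out set. The paper's five univariate and five multivariate models, its four missing-data cases, and its stacking variant are not reproduced here.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def impute_univariate(t, y, missing):
    """Univariate stand-in: linear interpolation over time, target series only."""
    out = y.copy()
    out[missing] = np.interp(t[missing], t[~missing], y[~missing])
    return out

def impute_multivariate(x, y, missing):
    """Multivariate stand-in: least-squares line y ~ a*x + b on observed rows."""
    a, b = np.polyfit(x[~missing], y[~missing], deg=1)
    out = y.copy()
    out[missing] = a * x[missing] + b
    return out

def inverse_error_weights(errors):
    """Assumed weighting rule: weights proportional to 1/error, summing to 1."""
    inv = 1.0 / (np.asarray(errors, dtype=float) + 1e-12)
    return inv / inv.sum()

# Hypothetical data: x is a fully observed sensor, y is a correlated target.
rng = np.random.default_rng(0)
n = 500
t = np.arange(n, dtype=float)
x = 20 + 5 * np.sin(2 * np.pi * t / 100) + rng.normal(0, 0.3, n)
y_true = 50 - 2.0 * (x - 20) + rng.normal(0, 0.5, n)

# Aperiodic gaps: 50 random test gaps, plus 50 held-out points for weighting.
idx = rng.permutation(n)
test_mask = np.zeros(n, dtype=bool)
val_mask = np.zeros(n, dtype=bool)
test_mask[idx[:50]] = True
val_mask[idx[50:100]] = True

y_obs = y_true.copy()
y_obs[test_mask] = np.nan  # these values are genuinely unavailable

# Step 1: estimate weights by hiding the validation points as well.
hidden = test_mask | val_mask
uni_val = impute_univariate(t, y_obs, hidden)
multi_val = impute_multivariate(x, y_obs, hidden)
weights = inverse_error_weights([
    mae(y_true[val_mask], uni_val[val_mask]),
    mae(y_true[val_mask], multi_val[val_mask]),
])

# Step 2: impute the real gaps with each single model, then combine.
uni = impute_univariate(t, y_obs, test_mask)
multi = impute_multivariate(x, y_obs, test_mask)
ensemble = weights[0] * uni + weights[1] * multi

# Step 3: score the imputed values against the original data.
scores = {
    name: (mae(y_true[test_mask], pred[test_mask]),
           rmse(y_true[test_mask], pred[test_mask]))
    for name, pred in [("univariate", uni),
                       ("multivariate", multi),
                       ("ensemble", ensemble)]
}
for name, (m, r) in scores.items():
    print(f"{name:12s} MAE={m:.3f} RMSE={r:.3f}")
```

Because the weights are non-negative and sum to one, the ensemble's MAE and RMSE cannot exceed those of the worse single model; whether it beats the better one depends on the data. A stacking variant would replace the fixed weights with a meta-model trained on the single models' predictions.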

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/73b3/8621076/9a45b1dcbde0/sensors-21-07595-g001.jpg
