An Ensemble Method for Missing Data of Environmental Sensor Considering Univariate and Multivariate Characteristics.

Affiliations

School of Statistics and Actuarial Science, Soongsil University, Seoul 06978, Korea.

School of Electronic Engineering, Soongsil University, Seoul 06978, Korea.

Publication information

Sensors (Basel). 2021 Nov 16;21(22):7595. doi: 10.3390/s21227595.

DOI:10.3390/s21227595
PMID:34833670
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8621076/
Abstract

With rapid urbanization, awareness of environmental pollution is growing, and with it interest in environmental sensors that measure atmospheric and indoor air quality. Because these IoT-based sensors are sensitive and their usefulness depends on reliable data, it is essential to handle missing values, which are one cause of reliability problems. Two characteristics can be used to impute missing values in environmental sensor data: the time dependency of a single variable and the correlation among multiple variables. Existing imputation methods, however, use only one of these characteristics; none uses both. In this work, we introduce a new ensemble imputation method that uses both. First, the situations in which missing values occur frequently were divided into four cases, and experimental data were generated for each: communication error (aperiodic, periodic) and sensor error (rapid change, measurement range). To compare the existing methods with the proposed method, five widely used univariate imputation methods and five widely used multivariate imputation methods were each applied as single models to predict the missing values in the four cases. The values predicted by the single models were then passed to the ensemble method, for which weighted averaging and stacking were used to derive the final predicted values and replace the missing values. Finally, the imputed values were compared with the original data using the mean absolute error (MAE) and the root mean square error (RMSE). The proposed ensemble method generally performed better than the single methods. In addition, it simultaneously considers the correlation between variables and time dependence, both of which must be considered for environmental sensors. As a result, the proposed ensemble technique can contribute to replacing the missing values generated by environmental sensors, which can help increase the reliability of environmental sensor data.
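
The pipeline described in the abstract (single-model imputation, then a weighted-average or stacking ensemble, then MAE/RMSE evaluation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic two-sensor data, the use of one univariate imputer (linear interpolation over time) and one multivariate imputer (linear regression on a correlated variable) instead of five of each, the inverse-MAE weighting rule, and the linear meta-model used for stacking. Only the aperiodic communication-error case is simulated, as randomly hidden samples.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def impute_univariate(t, y, missing):
    """Time dependency only: linear interpolation of the target over time."""
    return np.interp(t[missing], t[~missing], y[~missing])

def impute_multivariate(x, y, missing):
    """Cross-variable correlation only: regress the target on another sensor."""
    slope, intercept = np.polyfit(x[~missing], y[~missing], deg=1)
    return slope * x[missing] + intercept

def inverse_error_weights(errors):
    """Hypothetical weighting rule: weights proportional to 1 / validation MAE."""
    inv = 1.0 / (np.asarray(errors, dtype=float) + 1e-12)
    return inv / inv.sum()

# Synthetic data (assumed): x is an auxiliary sensor, y is the target sensor.
rng = np.random.default_rng(0)
n = 500
t = np.arange(n, dtype=float)
x = np.sin(2 * np.pi * t / 100) + 0.1 * rng.standard_normal(n)
y = 2.0 * x + 0.5 * np.sin(2 * np.pi * t / 37) + 0.1 * rng.standard_normal(n)

# Hide 100 random samples: 50 to tune the ensemble, 50 to evaluate it.
order = rng.permutation(n)
val_mask = np.zeros(n, dtype=bool)
test_mask = np.zeros(n, dtype=bool)
val_mask[order[:50]] = True
test_mask[order[50:100]] = True
hidden = val_mask | test_mask

# Single-model predictions at every hidden position (NaN elsewhere).
uni = np.full_like(y, np.nan)
multi = np.full_like(y, np.nan)
uni[hidden] = impute_univariate(t, y, hidden)
multi[hidden] = impute_multivariate(x, y, hidden)

# Weighted-average ensemble, with weights taken from validation error.
weights = inverse_error_weights(
    [mae(y[val_mask], uni[val_mask]), mae(y[val_mask], multi[val_mask])]
)
weighted = weights[0] * uni[test_mask] + weights[1] * multi[test_mask]

# Stacking: a linear meta-model fitted on the validation predictions.
a_val = np.column_stack([uni[val_mask], multi[val_mask], np.ones(val_mask.sum())])
coef, *_ = np.linalg.lstsq(a_val, y[val_mask], rcond=None)
a_test = np.column_stack([uni[test_mask], multi[test_mask], np.ones(test_mask.sum())])
stacked = a_test @ coef

truth = y[test_mask]
for name, pred in [("univariate", uni[test_mask]), ("multivariate", multi[test_mask]),
                   ("weighted average", weighted), ("stacking", stacked)]:
    print(f"{name:>16}: MAE={mae(truth, pred):.4f}  RMSE={rmse(truth, pred):.4f}")
```

Because the weighted average is a convex combination of the two single-model predictions, its MAE and RMSE can never exceed those of the worse single model. Whether it beats the better one depends on the data. The stacking variant carries no such guarantee, since its coefficients are unconstrained.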
