Error Prediction of Air Quality at Monitoring Stations Using Random Forest in a Total Error Framework

Affiliation

NILU-Norwegian Institute for Air Research, Postboks 100, 2027 Kjeller, Norway.

Publication information

Sensors (Basel). 2021 Mar 19;21(6):2160. doi: 10.3390/s21062160.

DOI: 10.3390/s21062160

PMID: 33808772

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8003348/

Abstract

Instead of the valid/non-valid flag usually proposed in the quality control (QC) processes of air quality (AQ), we proposed a method that predicts the p-value of each observation as a value between 0 and 1. We based our error predictions on three approaches: the one proposed by the Working Group on Guidance for the Demonstration of Equivalence (European Commission (2010)), the one proposed by Wager (Journal of Machine Learning Research, 15, 1625-1651 (2014)) and the one proposed by Lu (Journal of Machine Learning Research, 22, 1-41 (2021)). The Total Error framework makes it possible to differentiate the different errors: input, output, structural modeling and remnant. We thus theoretically described a one-site AQ prediction based on a multi-site network using Random Forest for regression in a Total Error framework. We demonstrated the methodology with a dataset of hourly nitrogen dioxide measured by a network of monitoring stations located in Oslo, Norway, and implemented the error predictions for the three approaches. The results indicate that a simple one-site AQ prediction based on a multi-site network using Random Forest for regression provides moderate metrics for fixed stations. According to the diagnostic based on the predictive qq-plot, among the three approaches used in this study, the approach proposed by Lu provides better error predictions. Furthermore, ensuring a high precision of the error prediction requires effort to obtain accurate input, output and prediction model, and to limit our lack of knowledge about the "true" AQ phenomena. We put effort into quantifying each type of error involved in the error prediction in order to assess the error prediction model and further improve it in terms of performance and precision.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e00f/8003348/2c09baad7878/sensors-21-02160-g001.jpg
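
The abstract describes assigning each observation a p-value between 0 and 1 and checking the error predictions with a predictive qq-plot. The following is a minimal Python sketch of that idea, not the authors' implementation. It assumes a Gaussian predictive distribution whose mean is the model prediction and whose standard deviation is the predicted error; the paper's three error-prediction approaches are not reproduced here. All function names (`two_sided_p_value`, `predictive_cdf_value`, `qq_points`, `max_qq_deviation`) are hypothetical.

```python
import math


def two_sided_p_value(observed, predicted, std_error):
    """Two-sided p-value of an observation under N(predicted, std_error**2).

    Assumption: the predictive distribution is Gaussian. A value near 1
    means the observation agrees with the prediction; a value near 0 means
    it is unlikely given the predicted error.
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    z = abs(observed - predicted) / std_error
    # P(|Z| >= z) for a standard normal Z.
    return math.erfc(z / math.sqrt(2.0))


def predictive_cdf_value(observed, predicted, std_error):
    """Value of the Gaussian predictive CDF at the observation.

    If the predicted errors are reliable, these values should be close to
    uniformly distributed on [0, 1] over many observations.
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    z = (observed - predicted) / std_error
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def qq_points(cdf_values):
    """Points of a predictive qq-plot.

    Returns (theoretical uniform quantile, sorted empirical value) pairs.
    Points close to the 1:1 line indicate well-calibrated error predictions.
    """
    n = len(cdf_values)
    ordered = sorted(cdf_values)
    return [((i + 0.5) / n, value) for i, value in enumerate(ordered)]


def max_qq_deviation(cdf_values):
    """Largest vertical distance of the qq-plot points from the 1:1 line."""
    return max(abs(q - value) for q, value in qq_points(cdf_values))
```

In this sketch, `two_sided_p_value` plays the role of a continuous quality indicator replacing a binary valid/non-valid flag, while `max_qq_deviation` gives one simple number for comparing how well different error-prediction approaches are calibrated. Both choices are illustrative assumptions rather than the metrics used in the paper.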

Similar articles

Mapping urban air quality in near real-time using observations from low-cost sensors and model information.

Environ Int. 2017 Sep;106:234-247. doi: 10.1016/j.envint.2017.05.005. Epub 2017 Jun 28.

A multicenter random forest model for effective prognosis prediction in collaborative clinical research network.

Artif Intell Med. 2020 Mar;103:101814. doi: 10.1016/j.artmed.2020.101814. Epub 2020 Feb 5.

Determination of the physical domain for air quality monitoring stations using the ANP-OWA method in GIS.

Environ Monit Assess. 2019 Jun 28;191(Suppl 2):299. doi: 10.1007/s10661-019-7422-3.

The impact of the congestion charging scheme on air quality in London. Part 1. Emissions modeling and analysis of air pollution measurements.

Res Rep Health Eff Inst. 2011 Apr(155):5-71.

Air quality prediction at new stations using spatially transferred bi-directional long short-term memory network.

Sci Total Environ. 2020 Feb 25;705:135771. doi: 10.1016/j.scitotenv.2019.135771. Epub 2019 Nov 26.

Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables.

PeerJ. 2018 Aug 29;6:e5518. doi: 10.7717/peerj.5518. eCollection 2018.

A methodological framework for improving air quality monitoring network layout. Applications to environment management.

J Environ Sci (China). 2021 Apr;102:138-147. doi: 10.1016/j.jes.2020.09.009. Epub 2020 Sep 30.

Long short-term memory neural network for air pollutant concentration predictions: Method development and evaluation.

Environ Pollut. 2017 Dec;231(Pt 1):997-1004. doi: 10.1016/j.envpol.2017.08.114. Epub 2017 Sep 25.

Assessment and statistical modeling of the relationship between remotely sensed aerosol optical depth and PM2.5 in the eastern United States.

Res Rep Health Eff Inst. 2012 May(167):5-83; discussion 85-91.

Cited by

Prediction and assessment of the impact of COVID-19 lockdown on air quality over Kolkata: a deep transfer learning approach.

Environ Monit Assess. 2022 Dec 22;195(1):223. doi: 10.1007/s10661-022-10761-x.

