Error Prediction of Air Quality at Monitoring Stations Using Random Forest in a Total Error Framework.

Affiliation

NILU-Norwegian Institute for Air Research, Postboks 100, 2027 Kjeller, Norway.

Publication information

Sensors (Basel). 2021 Mar 19;21(6):2160. doi: 10.3390/s21062160.

Abstract

Instead of the valid/non-valid flag usually proposed in quality control (QC) processes for air quality (AQ), we proposed a method that predicts the p-value of each observation as a value between 0 and 1. We based our error predictions on three approaches: the one proposed by the Working Group on Guidance for the Demonstration of Equivalence (European Commission (2010)), the one proposed by Wager (Journal of Machine Learning Research, 15, 1625-1651 (2014)) and the one proposed by Lu (Journal of Machine Learning Research, 22, 1-41 (2021)). The Total Error framework makes it possible to distinguish the different types of error: input, output, structural modeling and remnant. We thus described, theoretically, a one-site AQ prediction based on a multi-site network, using Random Forest for regression in a Total Error framework. We demonstrated the methodology with a dataset of hourly nitrogen dioxide measured by a network of monitoring stations located in Oslo, Norway, and implemented the error predictions for the three approaches. The results indicate that a simple one-site AQ prediction based on a multi-site network using Random Forest for regression provides moderate metrics for fixed stations. According to the diagnostic based on the predictive qq-plot, the approach proposed by Lu provides better error predictions than the other two approaches used in this study. Furthermore, ensuring high precision of the error prediction requires effort to obtain accurate input, output and prediction model, and to limit our lack of knowledge about the "true" AQ phenomena. We put effort into quantifying each type of error involved in the error prediction, in order to assess the error prediction model and to further improve its performance and precision.
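The predictive qq-plot diagnostic mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each observation comes with a predicted mean and a predicted standard deviation, that the predictive distribution is Gaussian, and that the value between 0 and 1 assigned to each observation is its cumulative probability under that distribution. The function names and the sample numbers are hypothetical. If the predicted errors are reliable, these values should be roughly uniform on [0, 1], so the sorted values should lie close to the diagonal of the plot.

```python
from statistics import NormalDist


def predictive_p_values(observed, predicted_mean, predicted_sd):
    """Cumulative probability of each observation under its own predictive
    distribution (assumed Gaussian here; predicted_sd must be > 0)."""
    return [
        NormalDist(mu, sd).cdf(y)
        for y, mu, sd in zip(observed, predicted_mean, predicted_sd)
    ]


def qq_points(p_values):
    """Pair each sorted p-value with the uniform quantile it should match.

    Returns a list of (theoretical, empirical) points; plotting them gives
    the predictive qq-plot."""
    n = len(p_values)
    theoretical = [(i + 1) / (n + 1) for i in range(n)]
    return list(zip(theoretical, sorted(p_values)))


def max_qq_deviation(p_values):
    """Largest vertical distance between the qq-points and the diagonal."""
    return max(abs(t - p) for t, p in qq_points(p_values))


if __name__ == "__main__":
    # Hypothetical hourly NO2 values (ug/m3) with predicted means and
    # predicted standard deviations; illustrative numbers, not real data.
    observed = [21.0, 35.5, 18.2, 40.1, 27.3]
    predicted_mean = [23.0, 33.0, 20.0, 36.0, 27.0]
    predicted_sd = [4.0, 5.0, 3.0, 6.0, 4.5]

    p = predictive_p_values(observed, predicted_mean, predicted_sd)
    for theoretical, empirical in qq_points(p):
        print(f"{theoretical:.3f}  {empirical:.3f}")
    print("max deviation from diagonal:", round(max_qq_deviation(p), 3))
```

A small maximum deviation suggests that the predicted standard deviations are consistent with the observed errors. Systematic departures from the diagonal point to predicted errors that are too small, too large or biased.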

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e00f/8003348/2c09baad7878/sensors-21-02160-g001.jpg
