Research Center for Environmental Changes, Academia Sinica, Nangang, Taipei 115, Taiwan.
Department of Atmospheric Sciences, National Taiwan University, Taipei 106, Taiwan.
Sensors (Basel). 2020 Sep 3;20(17):5002. doi: 10.3390/s20175002.
Many low-cost sensors (LCSs) are deployed for air monitoring without rigorous calibration. This work applies machine learning, using PM2.5 data from Taiwan monitoring stations, to conduct in-field corrections on a network of 39 PM2.5 LCSs from July 2017 to December 2018. Three candidate models were evaluated: multiple linear regression (MLR), support vector regression (SVR), and random forest regression (RFR). The model-corrected PM2.5 levels were compared with GRIMM-calibrated PM2.5 levels. RFR was superior to MLR and SVR in both correction accuracy and computing efficiency. Compared to SVR, the root mean square errors (RMSEs) of RFR were 35% and 85% lower for the training and validation sets, respectively, and computation was 35 times faster. An RFR with 300 decision trees was chosen as the optimal setting, considering both correction performance and modeling time. An RFR with a nighttime pattern was established as the optimal correction model; its RMSEs were 5.9 ± 2.0 μg/m³, reduced from 18.4 ± 6.5 μg/m³ before correction. This is the first work to correct LCSs at locations without monitoring stations, validated using laboratory-calibrated data. Similar models could be established in other countries to greatly enhance the usefulness of their PM2.5 sensor networks.
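The RMSE metric used above to compare models can be illustrated with a minimal sketch. This is not the authors' code. The sample readings are hypothetical. The single-predictor least-squares fit is only a simple stand-in for a correction model, and it is not the RFR described in the abstract. The helper names (`rmse`, `relative_reduction`, `fit_linear_correction`) are assumptions introduced for illustration. Only the two mean RMSE values, 18.4 and 5.9 μg/m³, come from the abstract.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


def fit_linear_correction(raw, reference):
    """Ordinary least squares for reference ~ slope * raw + intercept."""
    n = len(raw)
    mean_x = sum(raw) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in raw)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical readings in ug/m3 (illustrative only, not data from the study).
raw = [12.0, 25.0, 40.0, 55.0, 80.0]        # uncorrected sensor output
reference = [8.0, 15.0, 22.0, 30.0, 41.0]   # reference-grade values

slope, intercept = fit_linear_correction(raw, reference)
corrected = [slope * x + intercept for x in raw]

rmse_before = rmse(raw, reference)
rmse_after = rmse(corrected, reference)

# Relative reduction implied by the mean RMSEs reported in the abstract.
reported_reduction = relative_reduction(18.4, 5.9)
```

Applied to the reported network means, `relative_reduction(18.4, 5.9)` gives a reduction of roughly 68%. Because this figure is computed from the means alone, it ignores the stated ± spread across sensors.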