Evaluation of random forest regression and multiple linear regression for predicting indoor fine particulate matter concentrations in a highly polluted city.

Affiliations

Faculty of Health Sciences, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada.

School of Public Health, Mongolian National University of Medical Sciences, Zorig Street, Ulaanbaatar, 14210, Mongolia.

Publication information

Environ Pollut. 2019 Feb;245:746-753. doi: 10.1016/j.envpol.2018.11.034. Epub 2018 Nov 16.

DOI:10.1016/j.envpol.2018.11.034

PMID:30500754

Abstract

BACKGROUND

Indoor and outdoor fine particulate matter (PM2.5) are both leading risk factors for death and disease, but making indoor measurements is often infeasible for large study populations.

METHODS

We developed models to predict indoor PM2.5 concentrations for pregnant women who were part of a randomized controlled trial of portable air cleaners in Ulaanbaatar, Mongolia. We used multiple linear regression (MLR) and random forest regression (RFR) to model indoor PM2.5 concentrations with 447 independent 7-day PM2.5 measurements and 87 potential predictor variables obtained from outdoor monitoring data, questionnaires, home assessments, and geographic data sets. We also developed blended models that combined the MLR and RFR approaches. All models were evaluated in a 10-fold cross-validation.
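
The evaluation step described above can be sketched in a few lines. The following is a minimal illustration of 10-fold cross-validation for a multiple linear regression, not the authors' code: the data are synthetic, the two predictors and their coefficients are invented for the example, and the helper names (`fit_ols`, `cross_validated_r2`) are hypothetical. It covers only the MLR part; the random forest and blended models would need a dedicated library and are not shown.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))

def cross_validated_r2(rows, targets, k=10, seed=0):
    """Pool the held-out predictions from k folds, then compute R^2 once."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(rows)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        beta = fit_ols([rows[i] for i in train], [targets[i] for i in train])
        for i in held_out:
            preds[i] = predict(beta, rows[i])  # prediction from a model that never saw row i
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - p) ** 2 for y, p in zip(targets, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (assumption): 447 rows to mirror the sample size
# in the abstract; the predictors and coefficients below are made up.
rng = random.Random(42)
rows, targets = [], []
for _ in range(447):
    outdoor = rng.uniform(10.0, 200.0)   # hypothetical outdoor concentration
    cleaners = rng.choice([0, 1, 2])     # hypothetical number of air cleaners
    noise = rng.gauss(0.0, 15.0)
    rows.append((outdoor, cleaners))
    targets.append(5.0 + 0.3 * outdoor - 4.0 * cleaners + noise)

r2_cv = cross_validated_r2(rows, targets, k=10)
print(f"10-fold cross-validated R^2: {r2_cv:.3f}")
```

Pooling the held-out predictions before computing the statistic is one common convention; averaging per-fold values is another, and the abstract does not say which was used.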

RESULTS

The predictors in the MLR model were season, outdoor PM2.5 concentration, the number of air cleaners deployed, and the density of gers (traditional felt-lined yurts) surrounding the apartments. MLR and RFR had similar performance in cross-validation (R² = 50.2% and R² = 48.9%, respectively). The blended MLR model that included RFR predictions had the best performance (cross-validation R² = 81.5%). Intervention status alone explained only 6.0% of the variation in indoor PM2.5 concentrations.
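
For readers unfamiliar with the statistic, the percentages above are shares of variance explained. One common definition of the cross-validated coefficient of determination is given below; the abstract does not state which formula the authors used (some studies report the squared correlation between observed and predicted values instead), so this should be read as an assumption. Here $y_i$ is the $i$-th measured concentration, $\bar{y}$ is the mean of all $n$ measurements, and $\hat{y}_i^{(-k(i))}$ is the prediction for observation $i$ from a model fitted without the fold $k(i)$ that contains it.

```latex
R^2_{\mathrm{CV}} = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i^{(-k(i))} \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```

Under this definition, a value of 50.2% means that the held-out prediction errors carry about half the variance of the measurements themselves.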

CONCLUSIONS

We predicted a moderate amount of variation in indoor PM2.5 concentrations using easily obtained predictor variables, and the models explained substantially more variation than intervention status alone. While RFR shows promise for modelling indoor concentrations, our results highlight the importance of out-of-sample validation when evaluating model performance. We also demonstrate the improved performance of blended MLR/RFR models in predicting indoor air pollution.
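
The point about out-of-sample validation can be illustrated with a deliberately flexible model. The sketch below does not use random forest regression and is not the authors' analysis: it swaps in a 1-nearest-neighbour regressor on synthetic data, chosen only because it reproduces its training data exactly and is easy to write with the standard library. All names and numbers are assumptions made for the illustration.

```python
import random

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

def nearest_neighbour_predict(train_x, train_y, x):
    """Return the target of the training point closest to x."""
    j = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[j]

# Synthetic data (assumption): a linear signal plus Gaussian noise.
rng = random.Random(7)
xs = [rng.uniform(0.0, 100.0) for _ in range(200)]
ys = [0.5 * x + rng.gauss(0.0, 10.0) for x in xs]

# In-sample: every point is its own nearest neighbour, so the fit is exact.
in_sample = [nearest_neighbour_predict(xs, ys, x) for x in xs]

# Out-of-sample: leave each point out before predicting it.
held_out = []
for i in range(len(xs)):
    train_x = xs[:i] + xs[i + 1:]
    train_y = ys[:i] + ys[i + 1:]
    held_out.append(nearest_neighbour_predict(train_x, train_y, xs[i]))

r2_in = r_squared(ys, in_sample)
r2_out = r_squared(ys, held_out)
print(f"in-sample R^2:     {r2_in:.3f}")
print(f"out-of-sample R^2: {r2_out:.3f}")
```

The in-sample figure says nothing about how the model behaves on homes that were not measured, which is why a held-out estimate is the more useful one for exposure prediction.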
