Explainable forecasting of air quality index using a hybrid random forest and ARIMA model.

Authors

Yenkikar Anuradha, Mishra Ved Prakash, Bali Manish, Ara Tabassum

Affiliations

School of Engineering, Amity University Dubai Campus, Dubai, 25314, United Arab Emirates.

Department of CSE(AI), Vishwakarma Institute of Technology, Pune, India.

Publication details

MethodsX. 2025 Jul 18;15:103517. doi: 10.1016/j.mex.2025.103517. eCollection 2025 Dec.

DOI:10.1016/j.mex.2025.103517
PMID:40777582
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12329590/
Abstract

Accurate and interpretable prediction of the Air Quality Index (AQI) is critical for public health decision-making and environmental policy enforcement. This study presents a hybrid forecasting framework that combines Random Forest Regression (RFR) with an Autoregressive Integrated Moving Average (ARIMA) model to improve AQI prediction accuracy while keeping the model transparent. The RFR captures nonlinear relationships among pollutants, and ARIMA then models the temporal patterns left in the RFR residuals, forming a two-stage learning architecture. The model is trained and evaluated on multi-year AQI data from India and validated with an expanding-window cross-validation strategy, which preserves temporal order. For interpretability, the study uses SHAP (SHapley Additive exPlanations) to quantify the influence of key pollutants such as PM₂.₅, NO₂, and SO₂. Ljung-Box diagnostics and uncertainty bands are used to check model adequacy. Compared with baseline models, the hybrid approach achieves a lower Mean Squared Error (MSE = 508.46) and a higher R² score (0.94), indicating improved generalization. The study contributes a replicable, explainable, and efficient AQI forecasting framework suited to deployment in resource-constrained urban environments. The method comprises: (1) a residual-learning hybrid model, with Random Forest for prediction and ARIMA for residual correction; (2) time-aware validation using expanding-window cross-validation; and (3) model interpretability through SHAP analysis.
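
The expanding-window validation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the number of folds or the size of each test block, so the function name `expanding_window_splits` and the parameters `n_folds` and `test_size` are hypothetical choices for illustration, not the authors' settings. The point is only that each training window starts at the first observation and grows, while every test block lies strictly after it.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    n_samples: int, n_folds: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs in which the training window grows.

    Each test block holds `test_size` consecutive observations that come
    strictly after the training window, so no future value leaks into training.
    (Illustrative helper; names and defaults are assumptions.)
    """
    # Index where the first test block begins; everything before it is
    # the initial training window.
    first_test_start = n_samples - n_folds * test_size
    if n_folds < 1 or test_size < 1 or first_test_start < 1:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        test_start = first_test_start + fold * test_size
        train_idx = list(range(0, test_start))  # always starts at 0
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


# Toy example: 10 time-ordered observations, 3 folds of 2 test points each.
splits = list(expanding_window_splits(n_samples=10, n_folds=3, test_size=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]} -> test {test_idx}")
```

In a two-stage setup like the one described, both the Random Forest and the ARIMA model fitted to its residuals would presumably be refit on the training indices of each fold and scored on the matching test block. scikit-learn's `TimeSeriesSplit` produces comparable growing training sets, should a library implementation be preferred.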
