Helyar Simon, Alnaggar Aliaa
Mechanical, Industrial and Mechatronics Engineering Department, Toronto Metropolitan University, 350 Victoria Street, Toronto, M5B 2K3, Ontario, Canada.
J Environ Manage. 2025 Aug;389:125540. doi: 10.1016/j.jenvman.2025.125540. Epub 2025 Jun 14.
Poor air quality poses significant threats to public health and environmental sustainability. Mitigating these risks requires accurate air quality prediction to inform intervention policies that effectively reduce pollutant levels. While past research has focused on forecasting air quality trends, this paper proposes a novel predict-then-optimize framework that integrates machine learning models with a two-stage stochastic programming model. Our approach first forecasts fine particulate matter (PM2.5) levels, then uses these predictions in an optimization model to identify mitigation strategies for cities in Ontario, Canada. In the prediction phase, we develop and evaluate multiple machine learning models, including Random Forest, XGBoost, LSTM, Stacked LSTM, and ensemble architectures. These models use meteorological, wildfire, and historical air quality data. The predictions from the best-performing model are then used as inputs to a two-stage stochastic programming model, which selects optimal intervention policies for different cities while accounting for uncertainty in pollutant levels and adhering to budget constraints. Extensive computational experiments show that the ensemble model outperforms all other forecasting models, achieving an RMSE of 3.305. The results also highlight the effectiveness of the proposed stochastic programming model in identifying mitigation policies that reduce PM2.5 levels in all cities, with the majority of cities falling below the recommended limit.
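The predict-then-optimize idea can be illustrated with a minimal sketch. It is not the authors' formulation: the city names, forecasts, policy costs, reduction rates, scenario probabilities, limit, and budget below are all hypothetical values chosen for illustration, and the second stage is reduced to a simple-recourse penalty (expected exceedance above the limit) evaluated by brute-force enumeration rather than by a stochastic programming solver.

```python
from itertools import product

# All values below are hypothetical illustration data, not taken from the paper.
LIMIT = 10.0   # assumed PM2.5 threshold (ug/m3)
BUDGET = 5.0   # assumed total budget across all cities

# Stand-in for the point forecasts produced by the prediction phase.
forecast = {"A": 12.0, "B": 9.0, "C": 15.0}

# Assumed interventions: name -> (cost, fractional PM2.5 reduction).
policies = {"none": (0.0, 0.0), "moderate": (2.0, 0.15), "strong": (4.0, 0.40)}

# Assumed uncertainty scenarios: (probability, multiplier on the forecast).
scenarios = [(0.25, 0.8), (0.5, 1.0), (0.25, 1.3)]


def expected_exceedance(plan):
    """Second stage: probability-weighted PM2.5 excess above LIMIT."""
    total = 0.0
    for prob, factor in scenarios:
        for city, policy in plan.items():
            reduction = policies[policy][1]
            level = forecast[city] * factor * (1.0 - reduction)
            total += prob * max(0.0, level - LIMIT)
    return total


def solve():
    """First stage: pick one policy per city within BUDGET by enumeration."""
    cities = list(forecast)
    best_plan, best_value = None, float("inf")
    for choice in product(policies, repeat=len(cities)):
        cost = sum(policies[p][0] for p in choice)
        if cost > BUDGET:
            continue  # violates the budget constraint
        plan = dict(zip(cities, choice))
        value = expected_exceedance(plan)
        if value < best_value:
            best_plan, best_value = plan, value
    return best_plan, best_value


best_plan, best_value = solve()
print(best_plan, round(best_value, 3))
```

Enumeration is only workable for a handful of cities and policies. A realistic instance would be written as a mixed-integer program and handed to a solver, with scenarios derived from the forecasting model's error distribution.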