Ahmed Maqsood, Zhang Xiang, Shen Yonglin, Ahmed Tanveer, Ali Shahid, Ali Ayaz, Gulakhmadov Aminjon, Nam Won-Ho, Chen Nengcheng
National Engineering Research Center of Geographic Information System, School of Geography and Information Engineering, China University of Geosciences, Wuhan 430074, China.
Environ Int. 2025 May;199:109496. doi: 10.1016/j.envint.2025.109496. Epub 2025 Apr 26.
Air quality is crucial for both public health and environmental sustainability. An efficient and cost-effective model is essential for accurate air quality predictions and proactive pollution control. However, existing research primarily focuses on single static image analysis, which does not account for the dynamic and temporal nature of air pollution. Meanwhile, research on video-based air quality estimation remains limited, particularly in achieving accurate multi-pollutant outputs. This study proposes Air Quality Prediction-Mamba (AQP-Mamba), a video-based deep learning model that integrates a structured Selective State Space Model (SSM) with a selective scan mechanism and a hybrid predictor (HP) to estimate air quality. The spatiotemporal forward and backward SSM dynamically adjusts its parameters based on the input, ensures linear complexity, and effectively captures long-range dependencies through bidirectional processing of spatiotemporal features via four scanning techniques (row-wise, column-wise, and their vertical reversals), which allows the model to accurately track pollutant concentrations and air quality variations over time. Thus, the model efficiently extracts spatiotemporal features from video and simultaneously performs regression (PM2.5, PM10, and AQI) and classification (AQI) tasks. A high-quality outdoor hourly air quality dataset (LMSAQV) with 13,176 videos collected from six monitoring stations in Lahore, Pakistan, was utilized as the case study. The experimental results demonstrate that AQP-Mamba significantly outperforms several state-of-the-art models, including VideoSwin-T, VideoMAE, I3D, VTHCL, and TimeSformer. The proposed model achieves strong regression performance (PM2.5: R² = 0.91, PM10: R² = 0.90, AQI: R² = 0.92) and excellent classification metrics: accuracy (94.57%), precision (93.86%), recall (94.20%), and F1-score (93.44%).
The proposed model delivers consistent, real-time performance with a latency of 1.98 s per video, offering an effective, scalable, and cost-efficient solution for multi-pollutant estimation. This approach has the potential to address gaps in air quality data collected by expensive instruments globally.
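To make the four-directional scanning idea concrete, the sketch below illustrates how a spatial feature map can be flattened and scanned row-wise, column-wise, and in both reversed orders, then merged. This is a minimal illustration, not the authors' implementation: the `selective_scan_1d` recurrence here uses a fixed decay as a stand-in for the input-dependent parameters of a true selective SSM, and the merge-by-sum choice is an assumption borrowed from common Mamba vision variants.

```python
import numpy as np

def selective_scan_1d(x, decay=0.9):
    # Stand-in for the SSM selective scan: a simple linear recurrence
    # h[t] = decay * h[t-1] + x[t]. A real selective SSM makes the
    # transition parameters functions of the input at each step.
    h = np.zeros_like(x)
    acc = np.zeros(x.shape[1:])
    for t in range(x.shape[0]):
        acc = decay * acc + x[t]
        h[t] = acc
    return h

def four_way_scan(feat):
    # feat: (H, W, C) spatial feature map from one video frame.
    H, W, C = feat.shape
    outs = []
    # 1) row-wise (left-to-right, top-to-bottom)
    seq = feat.reshape(H * W, C)
    outs.append(selective_scan_1d(seq).reshape(H, W, C))
    # 2) row-wise reversed (bidirectional pass over the same sequence)
    outs.append(selective_scan_1d(seq[::-1])[::-1].reshape(H, W, C))
    # 3) column-wise (top-to-bottom, left-to-right)
    seq_c = feat.transpose(1, 0, 2).reshape(W * H, C)
    outs.append(
        selective_scan_1d(seq_c).reshape(W, H, C).transpose(1, 0, 2)
    )
    # 4) column-wise reversed
    outs.append(
        selective_scan_1d(seq_c[::-1])[::-1].reshape(W, H, C).transpose(1, 0, 2)
    )
    # Merge the four directional passes (summation assumed here).
    return sum(outs)

feat = np.random.rand(8, 8, 4)
out = four_way_scan(feat)
print(out.shape)  # (8, 8, 4)
```

Each pass gives every spatial position a causal context from a different direction, so the merged output carries long-range dependencies in all four orientations while each individual scan remains linear in the number of positions.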