Zhong Hui, Chen Di, Wang Pengqin, Wang Wenrui, Shen Shaojie, Liu Yonghong, Zhu Meixin
Intelligent Transportation Thrust, Systems Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511455, China.
Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Kowloon, Hong Kong 999077, China.
Environ Sci Technol. 2025 Feb 25;59(7):3582-3591. doi: 10.1021/acs.est.4c08380. Epub 2025 Jan 29.
Integrating mobile monitoring data with street view images (SVIs) holds promise for predicting local air pollution. However, the choice of algorithm, sampling strategy, and image quality all introduce additional error, and reliable references that quantify their effects are lacking. To bridge this gap, we employed 314 taxis to monitor NO, NO2, PM2.5, and PM10, and extracted features from ∼382,000 SVIs at multiple viewing angles (0°, 90°, 180°, 270°) and buffer radii (100-500 m). Three typical machine learning algorithms were then compared with an SVI-based land-use regression (LUR) model to evaluate their performance. In general, the machine learning methods outperformed linear LUR, with the ranking: random forest > XGBoost > neural network > LUR. An averaging strategy is an effective way to avoid the bias caused by insufficient feature capture. The optimal sampling strategy is therefore to integrate multiple viewing angles at a 100-m buffer, which achieved absolute errors mostly below 2.5 μg/m³ or ppb. In addition, overexposure, blur, and underexposure led to image misjudgments and incorrect identifications, causing road features to be overestimated and human-activity features to be underestimated. These findings enhance understanding and offer valuable support for developing image-based air quality models and other SVI-related research.
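As an illustration of the sampling strategy described above, the following is a minimal sketch of averaging SVI-derived features over several viewing angles within a buffer around a monitoring location. It is not the authors' implementation. The function name `buffer_mean_features`, the record layout (`x`, `y`, `heading`, `features`), the feature names `road` and `vegetation`, and the use of projected coordinates in metres are all assumptions made for this example, as is the reading that the averaging is taken across headings.

```python
from math import hypot
from statistics import mean

# Viewing angles named in the abstract (degrees).
HEADINGS = (0, 90, 180, 270)


def buffer_mean_features(site_xy, images, radius_m=100.0):
    """Average per-image feature fractions over all headings inside a buffer.

    site_xy  : (x, y) of the monitoring location, in metres (assumed projected CRS).
    images   : list of dicts with keys "x", "y", "heading", "features" (hypothetical layout).
    radius_m : buffer radius in metres; 100 m is the radius the abstract reports as optimal.
    """
    sx, sy = site_xy
    inside = [
        img for img in images
        if img["heading"] in HEADINGS
        and hypot(img["x"] - sx, img["y"] - sy) <= radius_m
    ]
    if not inside:
        return {}
    # A feature missing from an image counts as 0.0 for that image.
    names = sorted({name for img in inside for name in img["features"]})
    return {
        name: mean(img["features"].get(name, 0.0) for img in inside)
        for name in names
    }


# Made-up example data: one panorama point with four headings, plus one distant image.
images = [
    {"x": 0.0, "y": 10.0, "heading": 0, "features": {"road": 0.40, "vegetation": 0.10}},
    {"x": 0.0, "y": 10.0, "heading": 90, "features": {"road": 0.10, "vegetation": 0.50}},
    {"x": 0.0, "y": 10.0, "heading": 180, "features": {"road": 0.40, "vegetation": 0.10}},
    {"x": 0.0, "y": 10.0, "heading": 270, "features": {"road": 0.10, "vegetation": 0.50}},
    {"x": 300.0, "y": 0.0, "heading": 0, "features": {"road": 0.90, "vegetation": 0.00}},
]
result = buffer_mean_features((0.0, 0.0), images)
print(result)
```

In this toy data, a single heading along the road (0° or 180°) would report a road fraction of 0.40, while the four-heading average is 0.25, which is the kind of single-view bias that averaging is meant to reduce. The image 300 m away falls outside the 100-m buffer and does not contribute.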