Verma Sunita, Sharma Ajay, Payra Swagata, Chaudhary Neelam, Mishra Manoj
Institute of Environment and Sustainable Development, Banaras Hindu University, Varanasi, 221105, Uttar Pradesh, India.
Department of Civil Engineering, Indian Institute of Technology Bombay, Mumbai, India.
Environ Sci Pollut Res Int. 2024 Dec;31(58):66372-66387. doi: 10.1007/s11356-024-35564-0. Epub 2024 Dec 3.
In the present work, an interpretable machine learning model is developed for the first time to estimate concentrations of particulate matter with a diameter of 10 µm or less (PM10) over India using Aerosol Optical Depth (AOD) from two satellite sources, INSAT-3D and the Moderate Resolution Imaging Spectroradiometer (MODIS), over a 7-year period (2014 to 2020). Ground-based AOD from the Aerosol Robotic Network (AERONET) is used to validate the satellite-retrieved AOD. Observed PM10 data are acquired from Central Pollution Control Board (CPCB) stations across India. The analysis is performed on a monthly basis over this period. The results show that MODIS AOD products correlate well with AERONET AOD, whereas INSAT-3D AOD does not. However, after applying an error envelope and a threshold-based filtering technique, INSAT-3D AOD shows a significant correlation with ground-based AOD, approximately 0.66 for Jaipur and 0.57 for Kanpur, a performance close to that of MODIS-derived AOD. Satellite AOD data together with ground PM10 concentration data are used to train a machine learning model (random forest) to estimate the PM10 distribution across India for the year 2020. An encouraging R-squared (R²) value of 0.78 is observed between the estimated and observed PM10 concentrations. The model trains effectively and avoids large overestimation and underestimation. Although the estimated PM10 closely tracks the trends of the observed PM10, a few instances of overestimation persist, which suggests that an expanded training dataset is needed to further improve the model's accuracy. Finally, the machine learning model used for PM10 estimation is found to perform best with a calibrated satellite AOD product.
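The validation steps described in the abstract (threshold-based filtering, an error envelope around the ground reference, and correlation or R² scoring) can be illustrated with a short sketch. The abstract does not give the envelope form, the threshold, or the exact R² definition, so everything below is an assumption: the envelope ±(0.05 + 0.15 × AOD) is the form commonly quoted for MODIS retrievals over land, the cutoff `max_aod` is hypothetical, R² is computed as the coefficient of determination, and the sample values are invented for illustration only. The random forest training itself is not shown.

```python
import math


def within_envelope(sat_aod, ref_aod, offset=0.05, slope=0.15):
    """True if the satellite AOD lies inside +/-(offset + slope * ref_aod).

    The default offset and slope are an assumed envelope, not the paper's.
    """
    return abs(sat_aod - ref_aod) <= offset + slope * ref_aod


def filter_pairs(sat, ref, max_aod=3.0):
    """Keep collocated (satellite, ground) pairs passing both checks.

    max_aod is a hypothetical upper threshold for physically plausible AOD.
    """
    kept = []
    for s, r in zip(sat, ref):
        if not (0.0 <= s <= max_aod and 0.0 <= r <= max_aod):
            continue  # threshold filter: drop negative or extreme values
        if within_envelope(s, r):
            kept.append((s, r))
    return kept


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def r_squared(observed, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented sample values, for illustration only (not data from the study).
satellite_aod = [0.30, 0.90, 0.52, -0.10, 0.75]
aeronet_aod = [0.28, 0.40, 0.50, 0.20, 0.70]

kept = filter_pairs(satellite_aod, aeronet_aod)
sat_kept = [s for s, _ in kept]
ref_kept = [r for _, r in kept]
print(len(kept), "pairs kept; r =", round(pearson_r(sat_kept, ref_kept), 3))
```

In this sketch the filter runs before the correlation is computed, matching the order implied by the abstract: INSAT-3D AOD is screened first, and only the retained pairs are compared with AERONET. The same `r_squared` helper could score estimated against observed PM10 once a model has produced estimates.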