Lu Tianjun, Kim Sun-Young, Marshall Julian D
Department of Epidemiology and Environmental Health, College of Public Health, University of Kentucky, Lexington, Kentucky, USA.
Department of Cancer AI and Digital Health, Graduate School of Cancer Science and Policy, National Cancer Center, Goyang-si, Gyeonggi-do, Korea.
Geosci Data J. 2025 Apr;12(2). doi: 10.1002/gdj3.70005. Epub 2025 Apr 7.
Concentration estimates for ambient air pollution are used widely in fields such as environmental epidemiology, health impact assessment, urban planning, environmental equity and sustainability. This study builds on previous efforts by developing an updated high-resolution geospatial database of population-weighted annual-average concentrations for six criteria air pollutants (PM2.5, PM10, CO, NO2, SO2, O3) across the contiguous U.S. during a five-year period (2016-2020). We developed Land Use Regression (LUR) models within a partial-least-squares-universal kriging framework by incorporating several land use, geospatial and satellite-based predictor variables. The LUR models were validated using conventional and clustered cross-validation, with the former consistently showing superior performance in capturing the variability of air quality. Most models demonstrated reliable performance (e.g., mean squared error-based R² > 0.8, standardised root mean squared error < 0.1). We used the best modelling approach to develop estimates by Census Block, which were then population-weighted averaged at Census Block Group, Census Tract and County geographies. Our database provides valuable insights into the dynamics of air pollution, with utility for environmental risk assessment, public health, policy and urban planning.
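The aggregation and validation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The field names (`conc`, `pop`, `block_group`) and both function names are hypothetical, and the mean squared error-based R² is assumed here to be 1 minus the ratio of the mean squared error to the variance of the observations.

```python
from collections import defaultdict


def population_weighted_mean(blocks, key):
    """Aggregate block-level concentrations to a coarser geography.

    `blocks` is an iterable of dicts with hypothetical fields:
    'conc' (block concentration), 'pop' (block population) and
    `key` (identifier of the coarser geography, e.g. a block group).
    """
    weighted_sum = defaultdict(float)
    total_pop = defaultdict(float)
    for block in blocks:
        weighted_sum[block[key]] += block["conc"] * block["pop"]
        total_pop[block[key]] += block["pop"]
    # Geographies with zero total population have no defined weighted mean.
    return {g: weighted_sum[g] / total_pop[g] for g in weighted_sum if total_pop[g] > 0}


def mse_based_r2(observed, predicted):
    """Assumed definition: 1 - MSE / variance of the observations."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    variance = sum((o - mean_obs) ** 2 for o in observed) / n
    return 1.0 - mse / variance


# Illustrative values only, not data from the study.
blocks = [
    {"block_group": "A", "conc": 10.0, "pop": 100},
    {"block_group": "A", "conc": 20.0, "pop": 300},
    {"block_group": "B", "conc": 5.0, "pop": 0},
]
by_block_group = population_weighted_mean(blocks, "block_group")
```

The same function could be applied with a tract or county identifier as `key` to produce the coarser geographies. Unlike the squared correlation, an R² defined this way penalises systematic bias in the predictions as well as scatter.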