Iyer Hari S, Karasaki Seigi, Yi Li, Hswen Yulin, James Peter, VoPham Trang
Section of Cancer Epidemiology and Health Outcomes, Rutgers Cancer Institute, 120 Albany St. Tower 2, Office 8009, New Brunswick, NJ, 08901, USA.
Epidemiology Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA.
Curr Environ Health Rep. 2025 Sep 26;12(1):34. doi: 10.1007/s40572-025-00497-4.
Geospatial analysis is an essential tool for research on the relationship between environmental exposures and health, and is critical for understanding the impacts of environmental risk factors on diseases with long latency (e.g., cardiovascular disease, dementia, cancers) as well as on upstream behaviors including sleep, physical activity, and cognition. There is emerging interest in leveraging machine learning and artificial intelligence (AI) for environmental epidemiology research. In this review, we provide an accessible overview of recent advances.
There have been two major recent shifts in geospatial data types and analytic methods. First, novel methods for statistical prediction that combine geospatial analysis with machine learning and artificial intelligence (GeoAI) allow for scalable geospatial exposure assessment within large population health databases (e.g., cohorts, administrative claims). Second, the widespread adoption of smartphones and wearables equipped with global positioning systems and other sensors has allowed for passive data collection from people and, when combined with geographic information systems, enables exposure assessment at finer spatial and temporal resolution than ever before. Illustrative examples include refining models for predicting outdoor air pollution exposure, characterizing populations susceptible to water pollution, and using deep learning to classify Street View image-derived measures of greenspace. While these tools and approaches may facilitate more rapid, higher-quality objective exposure measurement, they pose challenges with respect to participant privacy, representativeness of collected data, and curation of high-quality validation sets for training GeoAI algorithms.
GeoAI approaches are beginning to be used for environmental exposure assessment and behavioral outcome ascertainment with higher spatial and temporal precision than before. Epidemiologists should continue to critically assess measurement accuracy and design validity when incorporating these new tools into their work.