Wei Feiran, Yang Shijun, Wang Huiying, Zhao Meng, Zhou Jinyi, Shen Xiaobing, Han Renqiang, Fei Gaoqiang
Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University, Nanjing, China.
Guangxi Meteorological Observatory, Nanjing, China.
Front Public Health. 2025 May 30;13:1536509. doi: 10.3389/fpubh.2025.1536509. eCollection 2025.
This study investigated the association between long-term exposure to fine particulate matter (PM) and lung cancer incidence in Jiangsu Province, China. We aimed to explore the effects of historical PM exposure at different time lags and to build a prediction model using machine learning methods.
An ecological epidemiology study.
Lung cancer incidence data from Jiangsu Province (2014-2018) were combined with satellite-derived annual PM concentration data for the previous 10 years (lag 0 to lag 9). Correlation and grey correlation analyses were performed to evaluate the lagged relationship between PM exposure and lung cancer incidence. To address multicollinearity among the lagged exposure variables, ridge regression, support vector regression, and a back-propagation artificial neural network were employed. A combined prediction model was then constructed from these three models using the optimal weighting method.
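The abstract does not state which grey correlation formulation was used. As a rough illustration only, the sketch below assumes Deng's grey relational grade with mean normalization and a distinguishing coefficient of 0.5; the function name `grey_relational_grades` and all numbers are hypothetical and are not data from the study.

```python
import numpy as np

def grey_relational_grades(reference, candidates, rho=0.5):
    """Deng's grey relational grade of each candidate series against a reference.

    reference:  1-D series (e.g. yearly incidence), length n
    candidates: 2-D array, one row per comparison series (e.g. one PM lag), shape (m, n)
    rho:        distinguishing coefficient (0.5 is a common default, assumed here)
    """
    ref = np.asarray(reference, dtype=float)
    cand = np.asarray(candidates, dtype=float)

    # Mean normalization removes differences in units and scale.
    ref_n = ref / ref.mean()
    cand_n = cand / cand.mean(axis=1, keepdims=True)

    # Absolute differences, with global minimum and maximum over all series.
    delta = np.abs(cand_n - ref_n)
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0:
        return np.ones(cand.shape[0])

    # Grey relational coefficients, averaged over time to give one grade per series.
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    return coeff.mean(axis=1)

# Made-up illustrative values, NOT data from the study.
incidence = [50.0, 52.0, 55.0, 57.0, 60.0]
pm_lags = [
    [100.0, 104.0, 110.0, 114.0, 120.0],  # hypothetical lag with the same shape as incidence
    [70.0, 66.0, 61.0, 58.0, 50.0],       # hypothetical lag moving in the opposite direction
]
grades = grey_relational_grades(incidence, pm_lags)
print(grades)
```

A grade closer to 1 means the candidate series tracks the shape of the reference series more closely, so under these assumptions ranking the lags by grade gives a simple ordering of their association with incidence.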
The incidence of lung cancer was significantly correlated with PM concentration at different historical time points, with the strongest correlation at lag 9. The combined prediction model that integrates multiple prediction methods showed higher accuracy and reliability in predicting lung cancer incidence than a single model.
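The abstract does not give the weighting formula. One common reading of the "optimal weighting method" is to choose combination weights that sum to 1 and minimize the in-sample sum of squared errors of the combined forecast; the sketch below assumes that formulation. The function name `optimal_weights` and the toy numbers are hypothetical, and the three prediction rows merely stand in for the outputs of the ridge, support vector, and neural network models.

```python
import numpy as np

def optimal_weights(y_true, preds):
    """Weights w (summing to 1) that minimize the squared error of w @ preds.

    y_true: observed values, length n
    preds:  predictions of m single models, shape (m, n), with n >= m
    Note: this unconstrained closed form can yield negative weights.
    """
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(preds, dtype=float)

    errors = p - y                 # (m, n) error series of each single model
    e_mat = errors @ errors.T      # (m, m) error cross-product matrix
    ones = np.ones(e_mat.shape[0])

    # Lagrange solution: w = E^{-1} 1 / (1' E^{-1} 1)
    z = np.linalg.solve(e_mat, ones)
    return z / z.sum()

# Made-up illustrative values, NOT data from the study.
y_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
single_preds = np.array([
    [1.2, 1.9, 3.3, 3.8, 5.1],   # stand-in for model 1
    [0.8, 2.2, 2.9, 4.3, 4.7],   # stand-in for model 2
    [1.1, 2.4, 2.7, 4.1, 5.3],   # stand-in for model 3
])

w = optimal_weights(y_obs, single_preds)
combined = w @ single_preds

sse_single = ((single_preds - y_obs) ** 2).sum(axis=1)
sse_combined = ((combined - y_obs) ** 2).sum()
print(w, sse_single, sse_combined)
```

Because each single model corresponds to a feasible weight vector (all weight on one model), the combined in-sample squared error under this formulation cannot exceed that of the best single model; out-of-sample performance still has to be checked separately.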
Long-term exposure to PM, especially exposure with a long lag time, is closely related to lung cancer incidence. The integrated machine learning prediction model can be used as a reliable tool to assess the health risks of air pollution.