Application of the Lasso regularisation technique in mitigating overfitting in air quality prediction models.

Authors

Pak Abbas, Rad Abdullah Kaviani, Nematollahi Mohammad Javad, Mahmoudi Mohammadreza

Affiliations

Department of Computer Sciences, Shahrekord University, Shahrekord, Iran.

Department of Environmental Engineering and Natural Resources, College of Agriculture, Shiraz University, Shiraz, 71946-85111, Iran.

Publication information

Sci Rep. 2025 Jan 2;15(1):547. doi: 10.1038/s41598-024-84342-y.

DOI:10.1038/s41598-024-84342-y
PMID:39747344
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11696743/
Abstract

As a significant global concern, air pollution triggers enormous challenges in public health and ecological sustainability, necessitating the development of precise algorithms to forecast and mitigate its impacts, which has led to the development of many machine learning (ML)-based models for predicting air quality. Meanwhile, overfitting is a prevalent issue with ML algorithms that decreases their efficacy and generalizability. The present investigation, using an extensive collection of data from 16 sensors in Tehran, Iran, from 2013 to 2023, focuses on applying the Least Absolute Shrinkage and Selection Operator (Lasso) regularisation technique to enhance the forecasting precision of ambient air pollutants concentration models, including particulate matter (PM2.5 and PM10), CO, NO2, SO2, and O3 while decreasing overfitting. The outputs were compared using the R-squared (R²), mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), and normalised mean square error (NMSE) indices. Despite the preliminary findings revealing that Lasso dramatically enhances model reliability by decreasing overfitting and determining key attributes, the model's performance in predicting gaseous pollutants against PM remained unsatisfactory (R² = 0.80 for PM2.5, 0.75 for PM10, 0.45 for CO, 0.55 for NO2, 0.65 for SO2, and 0.35 for O3). The minimal degree of missing data presumably explained the strong performance of the PM model, while the high dynamism of gases and their chemical interactions, in conjunction with the inherent characteristics of the model, were the primary factors contributing to the poor performance of the model. Simultaneously, the successful implementation of the Lasso regularisation approach in mitigating overfitting and selecting more important features makes it highly suggested for application in air quality forecasting models.
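
The abstract names the technique (Lasso) and the evaluation indices (R², MAE, MSE, RMSE, NMSE) but gives no implementation details. The following is a minimal Python sketch of how L1 regularisation zeroes out uninformative features, not the authors' pipeline. Everything in it is an assumption for illustration: the synthetic data, the penalty strength `lam`, the hypothetical function names `fit_lasso` and `regression_metrics`, the coordinate-descent solver, and the NMSE definition (MSE divided by the variance of the observed values; other definitions exist).

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/2n)*sum((y - b0 - X.w)^2) + lam*sum(|w|)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b0 = sum(y) / n
    residual = [yi - b0 for yi in y]  # y - b0 - X.w, with w = 0 at the start
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, residual)) / n
            scale = sum(c * c for c in col) / n
            new_wj = soft_threshold(rho, lam) / scale if scale > 0 else 0.0
            residual = [r + (w[j] - new_wj) * c for c, r in zip(col, residual)]
            w[j] = new_wj
        # The intercept is not penalised: re-centre the residual.
        shift = sum(residual) / n
        b0 += shift
        residual = [r - shift for r in residual]
    return b0, w


def regression_metrics(y_true, y_pred):
    """R2, MAE, MSE, RMSE and NMSE (assumed here to be MSE / variance of y_true)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    variance = sum((t - mean_y) ** 2 for t in y_true) / n
    return {"R2": 1.0 - mse / variance, "MAE": mae, "MSE": mse,
            "RMSE": mse ** 0.5, "NMSE": mse / variance}


# Synthetic stand-in for sensor data: only features 0 and 3 carry signal.
random.seed(0)
true_w = [3.0, 0.0, 0.0, -2.0, 0.0]
X = [[random.gauss(0, 1) for _ in true_w] for _ in range(200)]
y = [1.0 + sum(a * b for a, b in zip(row, true_w)) + random.gauss(0, 0.5)
     for row in X]

# Fit on the first 150 rows, evaluate on the held-out 50 rows.
b0, w = fit_lasso(X[:150], y[:150], lam=0.2)
y_hat = [b0 + sum(a * b for a, b in zip(row, w)) for row in X[150:]]
metrics = regression_metrics(y[150:], y_hat)

print("intercept:", round(b0, 3))
print("coefficients:", [round(c, 3) for c in w])
print("held-out metrics:", {k: round(v, 3) for k, v in metrics.items()})
```

Scoring on held-out rows rather than on the training rows is what makes these indices informative about overfitting. In practice a library solver such as scikit-learn's `Lasso` with a cross-validated penalty would normally replace the hand-written loop.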

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f00/11696743/6a6d23934acd/41598_2024_84342_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f00/11696743/dca53233213d/41598_2024_84342_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f00/11696743/7c150ec4071e/41598_2024_84342_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f00/11696743/aa9fc5656cc2/41598_2024_84342_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f00/11696743/23431bec757f/41598_2024_84342_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f00/11696743/6e206c467f8d/41598_2024_84342_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f00/11696743/78435baea267/41598_2024_84342_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f00/11696743/f37f77cbdd19/41598_2024_84342_Fig8_HTML.jpg
