Feature selection for global tropospheric ozone prediction based on the BO-XGBoost-RFE algorithm.

Affiliations

School of Computer Science, Liaocheng University, Liaocheng, 252000, China.

School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, 430073, China.

Publication information

Sci Rep. 2022 Jun 2;12(1):9244. doi: 10.1038/s41598-022-13498-2.

DOI:10.1038/s41598-022-13498-2
PMID:35655087
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9163069/
Abstract

Ozone is one of the most important air pollutants, with significant impacts on human health, regional air quality and ecosystems. In this study, we use geographic and environmental information from monitoring sites in 5577 regions worldwide, covering 2010 to 2014, as input features to predict the long-term average ozone concentration at each site. We propose BO-XGBoost-RFE, an XGBoost-based recursive feature elimination (RFE) model tuned by Bayesian optimization, and use several machine learning algorithms to predict ozone concentration from the optimal feature subset. Because recursive feature selection depends on the hyperparameters of the underlying model, different hyperparameter combinations select different feature subsets, so the subset obtained under any one setting may not be optimal. We therefore use Bayesian optimization to tune the parameters of XGBoost-based recursive feature elimination, obtaining the optimal parameter combination and the optimal feature subset under that combination. Experiments on global-scale long-term ozone concentration prediction show that models trained after BO-XGBoost-RFE feature selection achieve higher prediction accuracy than models trained on all features or on features selected by Pearson correlation. Among the four prediction models, random forest obtained the highest prediction accuracy, and the XGBoost prediction model showed the greatest improvement in accuracy.
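
The nesting described in the abstract, an outer hyperparameter search wrapped around an inner recursive feature elimination, can be illustrated with a minimal sketch. This is not the authors' implementation. To stay dependency-light it substitutes closed-form ridge regression for XGBoost and plain random search for Bayesian optimization, and it runs on synthetic data rather than the ozone dataset. All function names (ridge_fit, rfe, validation_mse, search) and parameter ranges are hypothetical choices made for illustration.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression (stand-in for XGBoost); returns coefficients."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def rfe(X, y, alpha, n_keep):
    """Recursive feature elimination: refit, drop the weakest feature, repeat."""
    kept = list(range(X.shape[1]))
    while len(kept) > n_keep:
        coef = ridge_fit(X[:, kept], y, alpha)
        # Features here share one scale, so |coef| serves as the importance score.
        kept.pop(int(np.argmin(np.abs(coef))))
    return kept

def validation_mse(X_tr, y_tr, X_va, y_va, alpha, kept):
    """Fit on the training split using only `kept` columns; score on validation."""
    coef = ridge_fit(X_tr[:, kept], y_tr, alpha)
    return float(np.mean((X_va[:, kept] @ coef - y_va) ** 2))

def search(X_tr, y_tr, X_va, y_va, n_trials=30, seed=0):
    """Outer loop: random search (stand-in for Bayesian optimization) over the
    model hyperparameter `alpha` and the subset size `n_keep`."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_trials):
        alpha = 10.0 ** rng.uniform(-3, 2)               # log-uniform in [1e-3, 1e2]
        n_keep = int(rng.integers(1, X_tr.shape[1] + 1))  # 1 .. n_features
        kept = rfe(X_tr, y_tr, alpha, n_keep)             # inner loop: RFE
        mse = validation_mse(X_tr, y_tr, X_va, y_va, alpha, kept)
        if best is None or mse < best["mse"]:
            best = {"mse": mse, "alpha": alpha, "features": sorted(kept)}
    return best

# Synthetic data (an assumption for the demo): only features 0 and 3 matter.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(300, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + data_rng.normal(scale=0.1, size=300)
X_tr, X_va, y_tr, y_va = X[:200], X[200:], y[:200], y[200:]

best = search(X_tr, y_tr, X_va, y_va)
print("selected features:", best["features"])
print("validation MSE:", best["mse"])
```

The point of the structure is that each candidate hyperparameter setting produces its own feature subset, and the search keeps the pair that scores best on held-out data. A real Bayesian optimizer would replace the random draws with proposals from a surrogate model of the objective.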
