Feature Selection and Machine Learning Approaches in Prediction of Current E-Cigarette Use Among U.S. Adults in 2022.

Affiliations

West Virginia Clinical and Translational Science Institute, Morgantown, WV 26506, USA.

Department of Biostatistics and Epidemiology, College of Public Health, East Tennessee State University, Johnson City, TN 37614, USA.

Publication information

Int J Environ Res Public Health. 2024 Nov 6;21(11):1474. doi: 10.3390/ijerph21111474.

DOI: 10.3390/ijerph21111474
PMID: 39595741
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11594230/
Abstract

Feature selection is the process of picking informative and relevant features from a larger collection of features. Few studies have examined predictors of current e-cigarette use among U.S. adults using feature selection and machine learning (ML) approaches. This study aimed to perform feature selection and develop ML models for predicting current e-cigarette use using the 2022 Health Information National Trends Survey (HINTS 6). The Boruta algorithm and the least absolute shrinkage and selection operator (LASSO) were used to perform feature selection on 71 variables. The Random Over-Sampling Examples (ROSE) method was used to deal with imbalanced data. Five ML algorithms, namely support vector machines (SVMs), logistic regression (LR), random forest (RF), gradient boosting machine (GBM), and extreme gradient boosting (XGBoost), were applied to develop ML models. The overall prevalence of current e-cigarette use was 4.3%. Using the 15 variables selected by both Boruta and LASSO, the RF algorithm provided the best classifier, with an accuracy of 0.992, sensitivity of 0.985, F1 score of 0.991, and AUC of 0.999. Weighted logistic regression further confirmed that age, education level, smoking status, belief in the harm of e-cigarette use, binge drinking, belief that alcohol increases cancer risk, and the Patient Health Questionnaire-4 (PHQ4) score were associated with e-cigarette use. This study confirmed the strength of ML techniques for survey data, and the findings will guide inquiry into the behaviors and mentalities of substance users.
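
To illustrate why the abstract pairs oversampling with sensitivity and F1 instead of relying on accuracy alone, here is a minimal Python sketch. It is not the authors' code: the function names `random_oversample` and `classification_metrics` and the toy data are hypothetical, and the oversampler simply duplicates minority-class rows, a simpler stand-in for ROSE, which generates synthetic examples from a smoothed bootstrap.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class matches the largest.

    Assumption: plain random oversampling, used here as a simple stand-in for
    ROSE (which draws synthetic examples instead of exact copies).
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    majority_size = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows of this class in the original data; copies are drawn from here.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(majority_size - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity (recall), precision, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical toy data: 43 users among 1000 respondents (4.3% prevalence).
    labels = [1] * 43 + [0] * 957
    rows = [[i] for i in range(1000)]

    # A classifier that always predicts "non-user" scores high accuracy
    # while identifying no e-cigarette users at all.
    print(classification_metrics(labels, [0] * 1000))

    # After oversampling, both classes have the same number of rows.
    balanced_rows, balanced_labels = random_oversample(rows, labels)
    print(Counter(balanced_labels))
```

With a prevalence near 4.3%, a model that never predicts the positive class still reaches about 0.957 accuracy on this toy data, which is why balanced training data and class-sensitive metrics matter when reading results like those reported above.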
