Prediction of Smoking Behavior From Single Nucleotide Polymorphisms With Machine Learning Approaches.

Authors

Xu Yi, Cao Liyu, Zhao Xinyi, Yao Yinghao, Liu Qiang, Zhang Bin, Wang Yan, Mao Ying, Ma Yunlong, Ma Jennie Z, Payne Thomas J, Li Ming D, Li Lanjuan

Affiliations

State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China.

Department of Public Health Sciences, University of Virginia, Charlottesville, VA, United States.

Publication information

Front Psychiatry. 2020 May 14;11:416. doi: 10.3389/fpsyt.2020.00416. eCollection 2020.

DOI: 10.3389/fpsyt.2020.00416
PMID: 32477189
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7241440/

Abstract

Smoking is a complex behavior with a heritability as high as 50%. Given such a large genetic contribution, it provides an opportunity to prevent those individuals who are susceptible to smoking dependence from ever starting to smoke by predicting their inherited predisposition with their genomic profiles. Although previous studies have identified many susceptibility variants for smoking, they have limited power to predict smoking behavior. We applied the support vector machine (SVM) and random forest (RF) methods to build prediction models for smoking behavior. We first used 1,431 smokers and 1,503 non-smokers of African origin for model building with a 10-fold cross-validation and then tested the prediction models on an independent dataset consisting of 213 smokers and 224 non-smokers. The SVM model with 500 top single nucleotide polymorphisms (SNPs) selected using logistic regression (p<0.01) as the feature selection method achieved an area under the curve (AUC) of 0.691, 0.721, and 0.720 for the training, test, and independent test samples, respectively. The RF model with 500 top SNPs selected using logistic regression (p<0.01) achieved AUCs of 0.671, 0.665, and 0.667 for the training, test, and independent test samples, respectively. Finally, we used the combined logistic (p<0.01) and LASSO (λ=10) regression to select features and the SVM algorithm for model building. The SVM model with 500 top SNPs achieved AUCs of 0.756, 0.776, and 0.897 for the training, test, and independent test samples, respectively. We conclude that machine learning methods are promising means to build predictive models for smoking.
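
The evaluation protocol described in the abstract (rank SNPs, keep the top ones, fit a classifier, and report AUC under 10-fold cross-validation) can be illustrated with a minimal, dependency-free sketch. This is not the authors' pipeline. It makes several assumptions: the data are simulated, the per-SNP logistic regression p-value is replaced by a simpler stand-in (the difference in mean allele count between smokers and non-smokers), the SVM is replaced by a plain additive score, and every function name (`auc`, `k_fold_indices`, `mean_diff`, `cross_validated_auc`, `simulate`) is hypothetical. SNPs are ranked inside each training fold only, which is a design choice of this sketch and is not stated in the abstract.

```python
import math
import random


def auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def mean_diff(X, y, j, rows):
    """Mean allele count of SNP j in cases minus controls (stand-in for a p-value ranking)."""
    case = [X[i][j] for i in rows if y[i] == 1]
    ctrl = [X[i][j] for i in rows if y[i] == 0]
    return sum(case) / len(case) - sum(ctrl) / len(ctrl)


def cross_validated_auc(X, y, top_k, k=10, seed=0):
    """Mean held-out AUC over k folds, selecting SNPs on the training folds only."""
    folds = k_fold_indices(len(y), k, seed)
    fold_aucs = []
    for f, test_rows in enumerate(folds):
        train_rows = [i for g, fold in enumerate(folds) if g != f for i in fold]
        diffs = [mean_diff(X, y, j, train_rows) for j in range(len(X[0]))]
        # Keep the top_k SNPs with the largest absolute case/control difference.
        chosen = sorted(range(len(diffs)), key=lambda j: -abs(diffs[j]))[:top_k]
        # Additive score (stand-in for the SVM): weight each allele count by its difference.
        scores = [sum(diffs[j] * X[i][j] for j in chosen) for i in test_rows]
        fold_aucs.append(auc([y[i] for i in test_rows], scores))
    return sum(fold_aucs) / len(fold_aucs)


def simulate(n=400, n_snps=50, n_causal=5, seed=1):
    """Simulated genotypes (0/1/2 allele counts); only the first n_causal SNPs affect the label."""
    rng = random.Random(seed)
    X = [[rng.choice((0, 1, 2)) for _ in range(n_snps)] for _ in range(n)]
    y = []
    for row in X:
        risk = sum(row[:n_causal]) - n_causal  # centered near zero
        prob = 1.0 / (1.0 + math.exp(-0.8 * risk))
        y.append(1 if rng.random() < prob else 0)
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    print(f"mean 10-fold AUC: {cross_validated_auc(X, y, top_k=5):.3f}")
```

Ranking SNPs on the full dataset before splitting would let information from the held-out samples leak into feature selection and inflate the cross-validated AUC, which is why the sketch repeats the ranking within every fold.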

Figures 1-4:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b037/7241440/c4d05eec0599/fpsyt-11-00416-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b037/7241440/229ef49410d1/fpsyt-11-00416-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b037/7241440/0e9f59f3214b/fpsyt-11-00416-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b037/7241440/edd0b30e0903/fpsyt-11-00416-g004.jpg
