Screening of glioma susceptibility SNPs and construction of risk models based on machine learning algorithms.

Authors

Hu Mingjun, Hao Jie, Wei Jie

Affiliations

Department of Neurosurgery, Xi'an Central Hospital, Xi'an, 710003, China.

School of Medicine, Northwest University, Xi'an, 710069, China.

Publication

BMC Neurol. 2025 Jun 5;25(1):243. doi: 10.1186/s12883-025-04262-w.

Abstract

BACKGROUND

Glioma is a common primary malignant brain tumor. This study aimed to screen key SNPs associated with glioma susceptibility in the Chinese Han population and to use them to develop a predictive model for glioma risk.

METHODS

A total of 614 participants (310 cases and 304 controls) were randomly assigned to two datasets: a training dataset (217 cases and 213 controls) and a validation dataset (93 cases and 91 controls). Genotyping of 59 SNPs in 35 genes was performed on the Agena MassARRAY platform. Key SNPs associated with glioma susceptibility were identified using LASSO, the SVM-RFE algorithm, and likelihood ratio tests. A nomogram was constructed to predict glioma risk, and its predictive accuracy was evaluated using calibration and ROC curves.
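The random assignment described above can be illustrated with a small sketch. The abstract does not state the split ratio or whether the split was stratified; the 70/30 per-class split below is an assumption, chosen only because it happens to reproduce the reported counts (217 + 93 = 310 cases, 213 + 91 = 304 controls). The function name `stratified_split`, the seed, and the label encoding are hypothetical.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split indices into training and validation sets, class by class.

    Assumption: a 70/30 split within each class; the abstract gives
    only the resulting counts, not the procedure.
    """
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)  # random assignment within the class
        n_train = round(train_fraction * len(members))
        train_idx += members[:n_train]
        valid_idx += members[n_train:]
    return sorted(train_idx), sorted(valid_idx)


# 1 = glioma case, 0 = control; totals are the sums of the reported counts.
labels = [1] * 310 + [0] * 304
train_idx, valid_idx = stratified_split(labels)

train_cases = sum(labels[i] for i in train_idx)
train_controls = len(train_idx) - train_cases
valid_cases = sum(labels[i] for i in valid_idx)
valid_controls = len(valid_idx) - valid_cases

print(train_cases, train_controls)  # 217 213
print(valid_cases, valid_controls)  # 93 91
```

Splitting within each class keeps the case-to-control ratio nearly identical in both datasets, which matches the near-balanced counts reported here.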

RESULTS

LASSO (λ = 0.022) selected 23 SNPs and the SVM-RFE algorithm (accuracy = 0.6845) selected 15 SNPs; 12 SNPs were selected by both methods. In addition, likelihood ratio tests identified 9 SNPs associated with glioma risk (p < 0.05). Combining these methods, 15 of the 59 SNPs were identified as hub SNPs. The nomogram and ROC curves indicated good predictive performance in the training cohort (AUC = 0.7950) and the validation cohort (AUC = 0.7433), with rs3950296 and rs1317082 emerging as important risk factors.
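To make the AUC figures easier to interpret: the area under the ROC curve equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison. It is not the authors' code, and the labels and scores are made-up toy values, not study data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of case scores against control scores."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


# Toy example (hypothetical values): 3 cases, 3 controls.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(toy_labels, toy_scores)
print(round(auc, 4))  # 0.8889 (8 of 9 case-control pairs ranked correctly)
```

On this reading, a validation AUC of 0.7433 means the model ranks a case above a control in roughly 74% of case-control pairs.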

CONCLUSIONS

Using machine learning and likelihood ratio tests, we identified fifteen SNPs associated with glioma risk, most notably rs3950296 and rs1317082. The predictive nomogram demonstrated good discrimination ability and potential utility for glioma risk prediction.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e4ae/12139121/0d0d3d4f5ca6/12883_2025_4262_Fig1_HTML.jpg
