
[Screening of immune related gene and survival prediction of lung adenocarcinoma patients based on LightGBM model].

Author information

Meng Xiangfu, Tian Youfa, Zhang Xiaoyan

Affiliation

School of Electronics and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125000, P. R. China.

Publication information

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Feb 25;41(1):70-79. doi: 10.7507/1001-5515.202305038.

Abstract

Lung cancer is among the malignant tumors that pose the greatest threat to human health, and studies have shown that some genes play an important regulatory role in its occurrence and development. This paper proposes a LightGBM ensemble learning method that builds a prognostic model from immune-related gene (IRG) profile data and clinical data to predict the survival of lung adenocarcinoma patients. First, the Limma package was used for differential gene expression analysis, CoxPH regression analysis was used to screen for IRGs associated with prognosis, and the XGBoost algorithm was then used to score the importance of the IRG features. Finally, LASSO regression analysis was used to select the IRGs for the prognostic model, yielding a total of 17 IRG features. LightGBM was trained on the screened IRGs. The K-means algorithm was used to divide the patients into three groups, and the area under the receiver operating characteristic (ROC) curve (AUC) of the model output showed that the accuracy of the model in predicting the survival rates of the three groups was 96%, 98% and 96%, respectively. The experimental results show that the proposed model can divide patients with lung adenocarcinoma into three groups [5-year survival rate higher than 65% (group 1), lower than 65% but higher than 30% (group 2) and lower than 30% (group 3)] and can accurately predict the 5-year survival rate of lung adenocarcinoma patients.
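The grouping and evaluation steps mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The abstract does not say which variables K-means was run on, so the sketch assumes a one-dimensional predicted 5-year survival probability per patient. The function names `kmeans_1d` and `roc_auc` and all numbers are hypothetical, chosen only for illustration.

```python
from statistics import mean


def kmeans_1d(values, k=3, iters=100):
    """Lloyd's algorithm on scalars; centroids start at evenly spaced order statistics."""
    ordered = sorted(values)
    n = len(ordered)
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
    return centroids, labels


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted 5-year survival probabilities (not data from the paper).
predicted = [0.91, 0.84, 0.78, 0.72, 0.55, 0.48, 0.41, 0.36, 0.22, 0.15, 0.09, 0.05]
# Hypothetical observed outcomes: 1 = alive at 5 years, 0 = not.
alive = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0]

centroids, labels = kmeans_1d(predicted, k=3)
auc = roc_auc(alive, predicted)
print("centroids:", centroids)
print("cluster labels:", labels)
print("AUC:", round(auc, 3))
```

Because the centroids are initialized in ascending order, cluster 0 here holds the lowest predicted survival and cluster 2 the highest. In a real analysis the scores would come from the trained model, and the AUC would be computed on held-out patients.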

Similar articles

1
[Screening of immune related gene and survival prediction of lung adenocarcinoma patients based on LightGBM model].
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2024 Feb 25;41(1):70-79. doi: 10.7507/1001-5515.202305038.
2
[Construction and Validation of Prognostic Risk Score Model of Autophagy Related Genes in Lung Adenocarcinoma].
Zhongguo Fei Ai Za Zhi. 2021 Aug 20;24(8):557-566. doi: 10.3779/j.issn.1009-3419.2021.103.09. Epub 2021 Jul 14.
3
[Construction of Lung Adenocarcinoma Prognosis Model and Drug Sensitivity Analysis Based on Cuproptosis Related Genes].
Zhongguo Fei Ai Za Zhi. 2023 Aug 20;26(8):591-604. doi: 10.3779/j.issn.1009-3419.2023.102.31.
4
[Bioinformatic analysis of prognostic metabolism-related genes in lung adenocarcinoma].
Xi Bao Yu Fen Zi Mian Yi Xue Za Zhi. 2023 Jan;39(1):41-48.
5
A novel ferroptosis-related genes model for prognosis prediction of lung adenocarcinoma.
BMC Pulm Med. 2021 Jul 13;21(1):229. doi: 10.1186/s12890-021-01588-2.
7
A novel glycosylation-related gene signature predicts survival in patients with lung adenocarcinoma.
BMC Bioinformatics. 2022 Dec 27;23(1):562. doi: 10.1186/s12859-022-05109-8.
9
Establishment and validation of an immune-associated signature in lung adenocarcinoma.
Int Immunopharmacol. 2020 Nov;88:106867. doi: 10.1016/j.intimp.2020.106867. Epub 2020 Aug 13.
10
Construction and analysis of a novel ferroptosis-related gene signature predicting prognosis in lung adenocarcinoma.
FEBS Open Bio. 2021 Nov;11(11):3005-3018. doi: 10.1002/2211-5463.13288. Epub 2021 Oct 16.
