Optimizing Prognostic Predictions in Liver Cancer with Machine Learning and Survival Analysis.

Authors

Cai Kaida, Fu Wenzhi, Wang Zhengyan, Yang Xiaofang, Liu Hanwen, Ji Ziyang

Affiliations

Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China.

Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China.

Publication information

Entropy (Basel). 2024 Sep 7;26(9):767. doi: 10.3390/e26090767.

Abstract

This study uses RNA sequencing data from The Cancer Genome Atlas (TCGA) to identify genetic markers linked to the progression of liver hepatocellular carcinoma (LIHC), a major contributor to cancer-related deaths worldwide that is characterized by a poor prognosis and limited treatment options. To handle the ultra-high-dimensional data, we apply sure independence screening (SIS) in combination with four feature selection methods: the least absolute shrinkage and selection operator (Lasso), smoothly clipped absolute deviation (SCAD), information gain (IG), and permutation variable importance (VIMP). These methods identify genes such as MED8 as significant markers for LIHC. The selected markers are then analyzed with three survival models: the Cox proportional hazards model, survival trees, and random survival forests. SIS-Lasso shows strong predictive accuracy, particularly in combination with the Cox proportional hazards model, whereas SIS-VIMP coupled with random survival forests achieves the highest overall performance. This approach improves the prediction of LIHC outcomes and provides insight into the genetic mechanisms underlying the disease, which may support personalized treatment strategies.
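The screening step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes simulated data rather than TCGA, ranks each feature by a univariate Cox score-test statistic at beta = 0 (a simple stand-in for the marginal partial-likelihood utility usually used in SIS for survival data), and keeps the top d = n / log(n) features. The function names `marginal_cox_score` and `sis_screen`, the simulation settings, and the choice of d are all assumptions made for illustration. The later Lasso, SCAD, IG, or VIMP stage is not shown.

```python
import numpy as np


def marginal_cox_score(X, time, event):
    """Univariate Cox score-test statistic at beta = 0 for every column of X.

    Assumes no tied event times (true for the continuous simulated times below).
    """
    order = np.argsort(-time)                 # descending time
    Xs = X[order]
    ev = event[order].astype(bool)
    # After sorting, the risk set of row i is rows 0..i (all with time >= time_i).
    n_at_risk = np.arange(1, len(time) + 1)[:, None]
    risk_mean = np.cumsum(Xs, axis=0) / n_at_risk
    risk_var = np.cumsum(Xs ** 2, axis=0) / n_at_risk - risk_mean ** 2
    U = (Xs - risk_mean)[ev].sum(axis=0)      # score: observed minus expected
    V = risk_var[ev].sum(axis=0)              # information at beta = 0
    return U ** 2 / np.maximum(V, 1e-12)


def sis_screen(X, time, event, d):
    """Return the indices of the d features with the largest marginal statistic."""
    stats = marginal_cox_score(X, time, event)
    return np.argsort(stats)[::-1][:d]


# Simulated ultra-high-dimensional survival data (hypothetical, not TCGA).
rng = np.random.default_rng(0)
n, p = 300, 2000
X = rng.standard_normal((n, p))
beta = np.array([1.0, -1.0, 0.8])             # only features 0, 1, 2 affect the hazard
hazard = np.exp(X[:, :3] @ beta)
event_time = rng.exponential(1.0 / hazard)
censor_time = rng.exponential(2.0, size=n)
time = np.minimum(event_time, censor_time)
event = (event_time <= censor_time).astype(int)

d = int(n / np.log(n))                        # a common SIS choice for the screened size
selected = sis_screen(X, time, event, d)
print(f"kept {len(selected)} of {p} features")
print("true features retained:", sorted(set(selected.tolist()) & {0, 1, 2}))
```

In a pipeline like the one the abstract describes, the `selected` columns would then be passed to a second-stage selector and a survival model; a maintained survival analysis library would normally replace the hand-written statistic used here.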


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3641/11431161/3c63bb0281fb/entropy-26-00767-g001.jpg
