Statistical and Machine Learning Methods for Discovering Prognostic Biomarkers for Survival Outcomes.

Authors

Yao Sijie, Wang Xuefeng

Affiliation

Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL, USA.

Publication information

Methods Mol Biol. 2023;2629:11-21. doi: 10.1007/978-1-0716-2986-4_2.

Abstract

Discovering molecular biomarkers that predict patient survival outcomes is an essential step toward improving prognosis and therapeutic decision-making in the treatment of severe diseases such as cancer. Because omics datasets are high-dimensional, statistical methods such as the least absolute shrinkage and selection operator (Lasso) have been widely applied to cancer biomarker discovery. Owing to their scalability and demonstrated prediction performance, machine learning methods such as XGBoost and neural network models have also gained popularity in the community recently. However, compared with more traditional survival methods such as the Kaplan-Meier estimator and Cox regression, high-dimensional methods for survival outcomes are still less well known to biomedical researchers. In this chapter, we discuss the key analytical procedures for employing these methods to identify biomarkers associated with survival data. We also identify important considerations that emerged from the analysis of actual omics data, and we discuss some typical instances of misapplication and misinterpretation of machine learning methods. Using lung cancer and head and neck cancer datasets as demonstrations, we provide step-by-step instructions and sample R code for prioritizing prognostic biomarkers.
