Classify multicategory outcome in patients with lung adenocarcinoma using clinical, transcriptomic and clinico-transcriptomic data: machine learning versus multinomial models.

Author information

Deng Fei, Shen Lanlan, Wang He, Zhang Lanjing

Affiliations

School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China.

Department of Pediatrics, Baylor College of Medicine, USDA/ARS Children's Nutrition Research Center, Houston, TX, USA.

Publication information

Am J Cancer Res. 2020 Dec 1;10(12):4624-4639. eCollection 2020.

Abstract

Classification of multicategory survival outcomes is important for precision oncology. Machine learning (ML) algorithms have been used to accurately classify multicategory survival outcomes in some cancer types, but not yet in lung adenocarcinoma. We therefore compared the performance of 3 ML models (random forests, support vector machine [SVM], and multilayer perceptron) and multinomial logistic regression (Mlogit) models for classifying the 4-category survival outcome of lung adenocarcinoma using data from The Cancer Genome Atlas (TCGA). Overall, the Mlogit model performed similarly to the SVM and multilayer perceptron models (micro-average area under the curve = 0.82), while the random forests model was inferior. Surprisingly, transcriptomic data alone and clinico-transcriptomic data appeared sufficient to accurately classify the 4-category survival outcome in these patients, whereas no model using clinical data alone performed well. Notably, [gene names missing in source] were the top-ranked genes that were associated with alive without disease and inversely linked to other outcomes. Similarly, [gene names missing in source] were associated with alive with progression, and [gene names missing in source] with dead with disease, while also being inversely linked to other outcomes. These cross-linked genes may be used for risk stratification and future treatment development.
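The abstract reports performance as a micro-average area under the curve. As a rough illustration of what that metric measures, the sketch below pools every (sample, class) pair of a 4-category problem into one binary problem and computes a single AUC from it. It is a minimal standard-library example, not the authors' pipeline: the function names, the integer class labels 0 to 3, and the toy scores are all assumptions made for illustration and are not data or results from the study.

```python
from typing import List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def micro_average_auc(
    y_true: Sequence[int],
    y_score: Sequence[Sequence[float]],
    n_classes: int,
) -> float:
    """Micro-average AUC: flatten the one-vs-rest indicator matrix, then score once."""
    flat_labels: List[int] = []
    flat_scores: List[float] = []
    for label, row in zip(y_true, y_score):
        for k in range(n_classes):
            flat_labels.append(1 if label == k else 0)
            flat_scores.append(row[k])
    return binary_auc(flat_labels, flat_scores)


# Hypothetical toy data: 4 samples, one per outcome category (labels 0..3).
toy_true = [0, 1, 2, 3]

# A classifier that always gives the true class the highest probability.
toy_perfect = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]

# A classifier that cannot distinguish the categories at all.
toy_uninformative = [[0.25, 0.25, 0.25, 0.25] for _ in toy_true]

print(micro_average_auc(toy_true, toy_perfect, 4))        # 1.0
print(micro_average_auc(toy_true, toy_uninformative, 4))  # 0.5
```

Because every (sample, class) pair contributes equally, micro-averaging weights frequent outcome categories more heavily than rare ones, which is worth keeping in mind when reading a single pooled figure for a 4-category outcome.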
