Machine learning algorithm and deep neural networks identified a novel subtype in hepatocellular carcinoma.

Affiliations

Department of Engineering Structure and Mechanics, Wuhan University of Technology, Wuhan, Hubei, China.

Department of Science and Education, Shenzhen Samii Medical Center, Shenzhen, Guangdong, China.

Publication information

Cancer Biomark. 2022;35(3):305-320. doi: 10.3233/CBM-220147.

Abstract

BACKGROUND

Hepatocellular carcinoma (HCC) is one of the most common malignant tumors. Because the disease lacks specific characteristics in its early stage, patients are usually diagnosed at an advanced stage.

OBJECTIVE

This study used machine learning algorithms to identify key genes in the progression of hepatocellular carcinoma and constructed a model to predict the survival risk of HCC patients.

METHODS

Transcriptome data and clinical information were downloaded from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO). Differential expression analysis and a Cox proportional-hazards model were used to identify survival-related genes. K-means clustering, random forests, and LASSO regression were used to identify novel subtypes of HCC and to screen key genes. The prediction model was constructed with deep neural networks (DNN), and Gene Set Enrichment Analysis (GSEA) was used to reveal the metabolic pathways in which the key genes are located.
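
The subtype-discovery step can be illustrated with a minimal sketch of K-means clustering. The abstract does not state the feature set, the initialization, or the software used, so everything below is an assumption made for illustration: the toy expression matrix, the `sq_dist` and `kmeans` helper names, and the farthest-first initialization are hypothetical and are not the authors' implementation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(samples, k=2, max_iter=100):
    """Lloyd's K-means with a deterministic farthest-first initialization."""
    # Start from the first sample, then repeatedly add the sample that is
    # farthest from all centroids chosen so far (an illustrative choice).
    centroids = [list(samples[0])]
    while len(centroids) < k:
        farthest = max(samples, key=lambda s: min(sq_dist(s, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(s, centroids[c])) for s in samples
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clustering converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data (hypothetical): 6 patients x 3 survival-related genes.
samples = [
    [2.1, 1.8, 2.4],
    [1.9, 2.2, 2.0],
    [2.3, 2.0, 1.7],
    [7.8, 8.1, 8.3],
    [8.2, 7.9, 7.6],
    [8.0, 8.4, 8.1],
]

labels, centroids = kmeans(samples, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In a pipeline like the one described, the resulting cluster labels would serve as candidate subtypes, which could then be compared by survival and used as targets when ranking genes with random forests or LASSO.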

RESULTS

Two subtypes with significantly different survival rates were identified (p < 0.0001, AUC = 0.720), along with 17 key genes associated with the subtypes. The accuracy of the deep neural network prediction model was greater than 93.3%. GSEA found that the survival-related genes were significantly enriched in hallmark gene sets in the MSigDB database.

CONCLUSIONS

In this study, we used machine learning algorithms to select 17 genes related to the survival risk of HCC patients and trained a DNN model on them to predict that risk. The genes that make up the model are all key genes that affect the formation and development of cancer.
