Deep learning identified glioblastoma subtypes based on internal genomic expression ranks.

Affiliations

Department of Neurosurgery, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi Province, People's Republic of China.

Department of Pharmacology, School of Pharmacy, Fourth Military Medical University, Xi'an, Shaanxi Province, People's Republic of China.

Publication information

BMC Cancer. 2022 Jan 20;22(1):86. doi: 10.1186/s12885-022-09191-2.

Abstract

BACKGROUND

Glioblastoma (GBM) can be divided into four subtypes according to genomic features: Proneural (PN), Neural (NE), Classical (CL), and Mesenchymal (ME). However, it is difficult to unify genomic expression profiles that different studies have standardized with different procedures, and to manually classify a given GBM sample into a subtype.

METHODS

An algorithm was developed to unify the genomic profiles of GBM samples into a standardized normal distribution (SND) based on their internal expression ranks. Deep neural network (DNN) and convolutional DNN (CDNN) models were trained on the original and SND data. In addition, expanded SND data, built by combining several datasets from The Cancer Genome Atlas (TCGA), were used to improve the robustness and generalization capacity of the CDNN models.
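
The abstract does not spell out the transformation, so the following is only a minimal sketch of one common way to map a sample onto a standard normal distribution by its internal expression ranks (a rank-based inverse normal transform). The function name `rank_to_snd`, the `(rank - 0.5) / n` quantile offset, the tie handling, and the toy values are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import NormalDist


def rank_to_snd(expression):
    """Map one sample's expression values onto standard normal quantiles by rank.

    Illustrative sketch only: the paper's exact algorithm may differ.
    """
    n = len(expression)
    # Gene indices ordered from lowest to highest expression.
    # sorted() is stable, so tied values keep their original order (an assumption).
    order = sorted(range(n), key=lambda i: expression[i])
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    transformed = [0.0] * n
    for rank, gene_index in enumerate(order, start=1):
        # (rank - 0.5) / n keeps the quantile strictly inside (0, 1).
        transformed[gene_index] = std_normal.inv_cdf((rank - 0.5) / n)
    return transformed


# Hypothetical toy profiles: the same five genes measured on two different scales.
raw_counts = [5.0, 120.0, 33.0, 0.5, 64.0]
log_scaled = [2.3, 6.9, 5.0, -1.0, 6.0]

snd_a = rank_to_snd(raw_counts)
snd_b = rank_to_snd(log_scaled)
```

Because only the within-sample ranks are used, the two toy profiles, which order their genes identically, map to the same transformed vector. That is the property that would let profiles standardized by different procedures be compared on a common scale.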

RESULTS

The SND data retained a unimodal distribution similar to that of the original data and preserved the internal expression ranks of all genes within each sample. CDNN models trained on the SND data showed significantly higher accuracy than DNN and CDNN models trained on the primary expression data. Interestingly, the CDNN models classified the NE subtype with the lowest accuracy in the GBM datasets, in the expanded datasets, and in IDH wild-type GBMs, consistent with recent studies suggesting that the NE subtype should be excluded. Furthermore, the CDNN models also recognized subtypes in independent GBM datasets, even from a small set of gene expression values.

CONCLUSIONS

GBM expression profiles can be transformed into unified SND data, which can be used to train CDNN models with high accuracy and generalization capacity. These models suggest that the NE subtype may not be compatible with the four-subtype classification system.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7fcf/8780813/f8159aee300c/12885_2022_9191_Fig1_HTML.jpg
