Capturing the latent space of an Autoencoder for multi-omics integration and cancer subtyping.

Affiliations

Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, 342037, Rajasthan, India.

Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, 342037, Rajasthan, India; School of Artificial Intelligence and Data Science, Indian Institute of Technology, Jodhpur, 342037, Rajasthan, India.

Publication information

Comput Biol Med. 2022 Sep;148:105832. doi: 10.1016/j.compbiomed.2022.105832. Epub 2022 Jul 5.

Abstract

BACKGROUND AND OBJECTIVE

Cancer subtyping aims to identify subgroups of cancer patients with distinguishable, clinically important phenotypes, which can assist in advancing subtype-targeted treatments. Subtype identification is a complicated task and therefore requires multi-omics data integration to identify patient subgroups precisely. Over the years, several computational attempts have been made to identify cancer subtypes accurately using integrative multi-omics analysis. Some studies have used Autoencoders (AE) to integrate multi-omics features in lower dimensions for identifying subtypes in specific types of cancer. However, attaining satisfactory generalized performance requires deep AE architectures that learn a highly informative latent space. Therefore, in this study, a novel AE-assisted cancer subtyping framework is presented that utilizes the compressed latent space of a Sparse AE neural network for multi-omics clustering.

METHODS

The proposed framework first performs supervised feature selection based on the survival status of the patients. The features selected from each omic dataset are passed to the AE. The information embedded in the latent space of the trained AE neural networks is then used for cancer subtyping with Spectral clustering. The AE architecture designed in this study exhaustively searches for the best compression of the multi-omics data by varying the number of neurons in the hidden layers and penalizing activations within the layers.
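
The search over bottleneck size and activation penalty described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single hidden layer, a ReLU latent code, a linear decoder, plain batch gradient descent, and synthetic data in place of the selected omic features. The function name train_sparse_ae, the learning rate, the epoch count, and the reading of "compression" as latent size = ratio x number of input features are all assumptions made for illustration.

```python
import numpy as np

def train_sparse_ae(X, n_latent, l1=0.001, lr=0.1, epochs=300, seed=0):
    """Hypothetical helper: one-hidden-layer autoencoder with an L1 penalty
    on the latent activations, trained by batch gradient descent.
    Returns (encode, losses)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_latent))
    b1 = np.zeros(n_latent)
    W2 = rng.normal(0.0, 0.1, (n_latent, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        pre = X @ W1 + b1
        Z = np.maximum(pre, 0.0)            # ReLU latent code (the bottleneck)
        X_hat = Z @ W2 + b2                 # linear reconstruction
        err = X_hat - X
        # Mean squared reconstruction error plus L1 penalty on activations.
        losses.append(float((err ** 2).mean() + l1 * np.abs(Z).mean()))
        # Backpropagation through the two layers.
        g_out = 2.0 * err / err.size
        g_Z = g_out @ W2.T + l1 * np.sign(Z) / Z.size
        g_pre = g_Z * (pre > 0)
        W2 -= lr * (Z.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_pre)
        b1 -= lr * g_pre.sum(axis=0)

    def encode(A):
        return np.maximum(A @ W1 + b1, 0.0)

    return encode, losses

# Synthetic stand-in for selected features: two patient groups, 50 features.
rng = np.random.default_rng(42)
n_per_group, n_features = 30, 50
centers = rng.normal(0.0, 2.0, (2, n_features))
X = np.vstack([c + rng.normal(0.0, 1.0, (n_per_group, n_features))
               for c in centers])
X = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize each feature

# Small grid over compression ratio and L1 penalty (values are illustrative).
results = {}
for ratio in (0.10, 0.20):
    n_latent = max(1, round(ratio * n_features))
    for l1 in (0.01, 0.001):
        encode, losses = train_sparse_ae(X, n_latent, l1=l1)
        results[(ratio, l1)] = (encode(X), losses)

for (ratio, l1), (Z, losses) in results.items():
    print(ratio, l1, Z.shape, round(losses[-1], 4))
```

Each entry in results holds the latent representation that would then be passed to the clustering step. A deeper architecture or a framework such as Keras or PyTorch would be the more usual choice in practice.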

RESULTS AND CONCLUSION

The proposed framework is applied to five different multi-omics cancer datasets taken from The Cancer Genome Atlas. To obtain a robust information bottleneck, a compression of 10-20% of the input features combined with an L1 regularization penalty of 0.01 or 0.001 is observed to perform well for most of the cancer datasets. Clustering performed on this latent representation generates clusters with better silhouette scores and significantly varying survival patterns. For further biological assessment, differential expression analysis is performed between the identified subtypes of Glioblastoma multiforme (GBM), followed by enrichment analysis of the differentially expressed biomarkers. Several pathways and disease ontology terms consistent with GBM are found to be significantly associated. The varying responses of the identified GBM subtypes to the drug Temozolomide are also tested to demonstrate their clinical importance. Hence, the study shows that AE-assisted multi-omics integration can be used for the prediction of clinically significant cancer subtypes.
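
To make the clustering and silhouette evaluation concrete, the following sketch partitions a toy latent representation and scores the result. It rests on several assumptions that are not in the abstract: the latent codes are synthetic, the partition is a two-way spectral cut (sign of the second eigenvector of the normalized graph Laplacian) standing in for full Spectral clustering, and the RBF bandwidth uses a median heuristic. The helper names pairwise_sq_dists, spectral_bipartition, and mean_silhouette are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(Z):
    """Squared Euclidean distances between all rows of Z."""
    sq = (Z ** 2).sum(axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.maximum(D2, 0.0)              # clamp tiny negative round-off

def spectral_bipartition(Z):
    """Two-way spectral cut on an RBF affinity graph (assumed simplification)."""
    D2 = pairwise_sq_dists(Z)
    # Median heuristic for the RBF bandwidth (an assumption, not from the paper).
    gamma = 1.0 / np.median(D2[np.triu_indices_from(D2, k=1)])
    A = np.exp(-gamma * D2)
    np.fill_diagonal(A, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}.
    L = np.eye(len(Z)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)             # eigenvalues in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt       # second-smallest eigenvector
    return (fiedler > 0).astype(int)

def mean_silhouette(Z, labels):
    """Mean silhouette coefficient; assumes at least two clusters."""
    D = np.sqrt(pairwise_sq_dists(Z))
    cluster_ids = set(labels.tolist())
    scores = []
    for i, li in enumerate(labels):
        same = labels == li
        same[i] = False                     # exclude the point itself
        if not same.any():                  # singleton cluster scores 0
            scores.append(0.0)
            continue
        a = D[i, same].mean()               # mean intra-cluster distance
        b = min(D[i, labels == lj].mean() for lj in cluster_ids if lj != li)
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Toy latent codes: two groups of 20 samples in a 5-dimensional latent space.
rng = np.random.default_rng(0)
Z = np.vstack([rng.normal(0.0, 0.5, (20, 5)),
               rng.normal(3.0, 0.5, (20, 5))])

labels = spectral_bipartition(Z)
score = mean_silhouette(Z, labels)
print(labels, round(score, 3))
```

A silhouette value near 1 indicates compact, well-separated clusters, so comparing this score across latent representations is one way to rank candidate bottleneck settings. For more than two subtypes, an established implementation such as scikit-learn's SpectralClustering would be the natural replacement for the two-way cut used here.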
