MMDAE-HGSOC: A novel method for high-grade serous ovarian cancer molecular subtypes classification based on multi-modal deep autoencoder.

Affiliations

College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China.

Publication information

Comput Biol Chem. 2023 Aug;105:107906. doi: 10.1016/j.compbiolchem.2023.107906. Epub 2023 Jun 14.

Abstract

High-grade serous ovarian cancer (HGSOC) is a type of ovarian cancer that develops from serous tubal intraepithelial carcinoma. The intrinsic differences among its molecular subtypes are closely associated with prognosis and pathological characteristics. Current multi-omics data integration methods fall into two groups: early integration and late integration. Most existing methods for classifying HGSOC molecular subtypes rely on early integration of multi-omics data, which ignores the mutual interference among the omics types and reduces the effectiveness of feature learning. In addition, high-dimensional multi-omics data contain genes unassociated with HGSOC molecular subtypes; this redundant information hinders model training. In this paper, we propose a multi-modal deep autoencoder learning method, MMDAE-HGSOC. miRNA expression, DNA methylation, and copy number variation (CNV) data are integrated with mRNA expression data to construct a multi-omics feature space. A multi-modal deep autoencoder network learns a high-level feature representation of the multi-omics data. A superposition LASSO (S-LASSO) regression algorithm is proposed to capture the genes associated with HGSOC molecular subtypes more fully. Experimental results show that MMDAE-HGSOC outperforms existing classification methods. Finally, we analyze the enriched Gene Ontology (GO) terms and biological pathways of the significant genes discovered during gene selection.
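To illustrate the idea of LASSO-based gene selection mentioned in the abstract, the sketch below runs ordinary LASSO regression by cyclic coordinate descent on synthetic data. It is not the paper's S-LASSO, which the abstract does not define. The function names (`soft_threshold`, `lasso_coordinate_descent`, `select_features`), the penalty `alpha=0.1`, and the toy data are all assumptions made for illustration only.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iters=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1.

    X is a list of n rows with p features each; y is a list of n targets.
    Plain LASSO only: this is NOT the S-LASSO variant from the paper.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw because w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iters):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant-zero feature carries no signal
            # Correlation of feature j with the residual that excludes
            # feature j's own current contribution.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j])
                      for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def select_features(weights, tol=1e-8):
    """Return the indices of features with a non-zero coefficient."""
    return [j for j, wj in enumerate(weights) if abs(wj) > tol]


# Synthetic stand-in for an expression matrix: 60 samples, 8 "genes".
# Only genes 0 and 3 drive the response; the rest are noise.
random.seed(0)
n_samples, n_features = 60, 8
X = [[random.gauss(0.0, 1.0) for _ in range(n_features)]
     for _ in range(n_samples)]
y = [2.0 * row[0] - 3.0 * row[3] + random.gauss(0.0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = select_features(weights)
print("selected feature indices:", selected)
```

The L1 penalty drives the coefficients of uninformative features to exactly zero, so the surviving indices act as the selected genes. A larger `alpha` gives a sparser selection at the cost of more shrinkage on the retained coefficients.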
