Deep learning-based ovarian cancer subtypes identification using multi-omics data.

Authors

Guo Long-Yi, Wu Ai-Hua, Wang Yong-Xia, Zhang Li-Ping, Chai Hua, Liang Xue-Fang

Affiliations

Second School of Clinical Medicine, Guangzhou University of Chinese Medicine, Guangzhou, 510020 China.

Center for Reproductive Medicine, Guangdong Hospital of Traditional Chinese Medicine, Guangzhou, 510120 China.

Publication information

BioData Min. 2020 Aug 24;13:10. doi: 10.1186/s13040-020-00222-x. eCollection 2020.

Abstract

BACKGROUND

Identifying molecular subtypes of ovarian cancer is important. Compared with identifying subtypes from single-omics data, multi-omics analysis can draw on more information. Autoencoders have been widely used to construct lower-dimensional representations for multi-omics feature integration. However, deep autoencoder architectures are difficult to train to satisfactory generalization performance. To address this problem, we propose a novel deep learning-based framework that uses a denoising autoencoder to robustly identify ovarian cancer subtypes.

RESULTS

In the proposed method, a denoising autoencoder produces composite features from the multi-omics data in The Cancer Genome Atlas (TCGA), and the resulting low-dimensional features are passed to k-means for clustering. Based on the clustering results, we then built a lightweight classification model using L1-penalized logistic regression. Furthermore, we applied differential expression analysis and weighted gene co-expression network analysis (WGCNA) to select target genes related to the molecular subtypes. We identified 34 biomarkers and 19 KEGG pathways associated with ovarian cancer.
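
The three modelling steps named above (denoising autoencoder, k-means, L1-penalized logistic regression) can be sketched end to end as follows. This is a minimal illustration and not the authors' implementation: the abstract does not give the network architecture, noise level, number of subtypes, or penalty strength, so `n_hidden`, `noise_std`, `k=2`, `lam`, and all function names here are hypothetical choices, and the data are synthetic rather than TCGA.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for multi-omics data (NOT TCGA): 100 samples x 20 features,
# with two hidden groups that differ only in the first 5 features.
truth = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 20))
X[truth == 1, :5] += 3.0
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each feature


def train_dae(X, n_hidden=2, noise_std=0.5, lr=0.02, epochs=500, seed=0):
    """One-hidden-layer denoising autoencoder, full-batch gradient descent."""
    r = np.random.default_rng(seed)
    n, d = X.shape
    W1 = r.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = r.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        X_noisy = X + r.normal(0, noise_std, X.shape)  # corrupt the input
        H = np.tanh(X_noisy @ W1 + b1)                 # encode
        X_hat = H @ W2 + b2                            # decode
        err = (X_hat - X) / n                          # target is the CLEAN input
        dH = (err @ W2.T) * (1 - H ** 2)               # backprop through tanh
        W2 -= lr * (H.T @ err); b2 -= lr * err.sum(axis=0)
        W1 -= lr * (X_noisy.T @ dH); b1 -= lr * dH.sum(axis=0)

    def encode(A):
        return np.tanh(A @ W1 + b1)
    return encode


def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one integer label per row of Z."""
    r = np.random.default_rng(seed)
    centers = Z[r.choice(len(Z), k, replace=False)]
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels


def l1_logistic(X, y, lam=0.05, lr=0.1, epochs=500):
    """Binary logistic regression with an L1 penalty (proximal gradient)."""
    n, d = X.shape
    w = np.zeros(d); b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / n
        b -= lr * np.mean(p - y)
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # soft threshold
    return w, b


encode = train_dae(X)
Z = encode(X)                                   # low-dimensional features
labels = kmeans(Z, k=2)                         # cluster labels act as subtypes
w, b = l1_logistic(X, labels.astype(float))     # sparse classifier on raw features
print("cluster sizes:", np.bincount(labels, minlength=2))
print("features kept by the L1 penalty:", np.flatnonzero(w))
```

The classifier is fitted on the original features with the cluster assignments as targets, so the L1 penalty drives uninformative weights to exactly zero and leaves a small feature set, which is one plausible reading of the "lightweight" model described in the abstract.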

CONCLUSIONS

Independent tests on three GEO datasets demonstrated the robustness of our model. A literature review shows that 19 (56%) of the biomarkers and 8 (42.1%) of the KEGG pathways identified from the classified subtypes have previously been shown to be associated with ovarian cancer. These outcomes indicate that the proposed method is feasible and can provide reliable results.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/476e/7447574/d52c21dafaaf/13040_2020_222_Fig1_HTML.jpg
