Association Analysis of Deep Genomic Features Extracted by Denoising Autoencoders in Breast Cancer.

Authors

Liu Qian, Hu Pingzhao

Affiliations

Department of Biochemistry and Medical Genetics, College of Medicine, Faculty of Health Sciences, University of Manitoba, Winnipeg, MB R3E 0J9, Canada.

Research Institute in Oncology and Hematology, CancerCare Manitoba, Winnipeg, MB R3E 0V9, Canada.

Publication information

Cancers (Basel). 2019 Apr 7;11(4):494. doi: 10.3390/cancers11040494.

DOI:10.3390/cancers11040494
PMID:30959966
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6520782/
Abstract

Artificial intelligence-based unsupervised deep learning (DL) is widely used to mine multimodal big data. However, there are few applications of this technology to cancer genomics. We aim to develop DL models to extract deep features from breast cancer gene expression data and copy number alteration (CNA) data, separately and jointly. We hypothesize that the deep features are associated with patients' clinical characteristics and outcomes. Two unsupervised denoising autoencoders (DAs) were developed to extract deep features from TCGA (The Cancer Genome Atlas) breast cancer gene expression and CNA data, separately and jointly. A heat map was used to view and cluster patients into subgroups based on these DL features. Fisher's exact test and Pearson's chi-square test were applied to test the associations between patient groups and clinical information. Survival differences between the groups were evaluated by Kaplan-Meier (KM) curves. Associations between each of the features and patients' overall survival were assessed using Cox's proportional hazards (COX-PH) model, and a risk score for each feature set from the different omics data sets was generated from the survival regression coefficients. The risk scores for each feature set were binarized into high- and low-risk patient groups to evaluate survival differences using KM curves. Furthermore, the risk scores were traced back to their gene-level DA weights so that the three gene lists for each of the genomic data points could be generated for gene set enrichment analysis. Patients were clustered into two groups based on concatenated features from the gene expression and CNA data, and these two groups showed different overall survival rates (p-value = 0.049) and different ER (estrogen receptor) statuses (p-value = 0.002, OR (odds ratio) = 0.626). All the risk scores from the gene expression data, the CNA data, and their concatenation were significantly associated with breast cancer survival. Patients in the high-risk group had significantly worse outcomes (p-values ≤ 0.0023). The concatenated risk score was enriched for the AMP-activated protein kinase (AMPK) signaling pathway, the regulation of DNA-templated transcription, the regulation of nucleic acid-templated transcription, the regulation of apoptotic process, the positive regulation of gene expression, the positive regulation of cell proliferation, heart morphogenesis, and the regulation of cellular macromolecule biosynthetic process, with FDR (false discovery rate) less than 0.05. We confirmed that DAs can effectively extract meaningful genomic features from genomic data and that concatenating multiple data sources can improve the significance of the features associated with breast cancer patients' clinical characteristics and outcomes.
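
To make the feature-extraction step described in the abstract more concrete, the following is a minimal sketch of a denoising autoencoder applied to two data matrices, with the resulting hidden-layer features concatenated. It is not the authors' implementation: the abstract gives no architecture details, so the single hidden layer, tied weights, sigmoid units, masking noise, squared-error loss, full-batch gradient descent, the hidden-layer sizes, and the helper names `train_da`, `encode`, and `toy_matrix` are all assumptions made for illustration. The input matrices are synthetic stand-ins, not TCGA data.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_da(X, n_hidden, corruption=0.2, lr=0.5, epochs=300, seed=0):
    """Train a one-hidden-layer denoising autoencoder with tied weights.

    X: (n_samples, n_features) array scaled to [0, 1].
    Returns the weights, the hidden bias, and the per-epoch reconstruction loss.
    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W = rng.normal(0.0, 0.1, size=(n_features, n_hidden))
    b_hid = np.zeros(n_hidden)
    b_vis = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        # Masking noise: zero out a random fraction of the inputs.
        keep = rng.random(X.shape) >= corruption
        X_noisy = X * keep
        H = sigmoid(X_noisy @ W + b_hid)      # encode the corrupted input
        X_hat = sigmoid(H @ W.T + b_vis)      # decode with the tied weights
        err = X_hat - X                       # compare against the clean input
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the squared error through both sigmoid layers
        # (constant factors are folded into the learning rate).
        d_out = err * X_hat * (1.0 - X_hat)
        d_hid = (d_out @ W) * H * (1.0 - H)
        grad_W = (X_noisy.T @ d_hid + d_out.T @ H) / n_samples
        W -= lr * grad_W
        b_hid -= lr * d_hid.mean(axis=0)
        b_vis -= lr * d_out.mean(axis=0)
    return W, b_hid, losses


def encode(X, W, b_hid):
    """Deep features: hidden activations for the uncorrupted input."""
    return sigmoid(X @ W + b_hid)


# Synthetic stand-ins for the two omics matrices (hypothetical, not TCGA data).
rng = np.random.default_rng(1)
group = rng.integers(0, 2, size=60)           # hidden two-group structure


def toy_matrix(n_features):
    signal = np.full((60, n_features), 0.2)
    signal[:, : n_features // 2] += 0.6 * group[:, None]
    return np.clip(signal + rng.normal(0.0, 0.05, signal.shape), 0.0, 1.0)


X_expr = toy_matrix(20)                       # stand-in for gene expression
X_cna = toy_matrix(12)                        # stand-in for CNA

# One autoencoder per data type, then concatenate the learned features.
W_expr, b_expr, losses_expr = train_da(X_expr, n_hidden=5, seed=0)
W_cna, b_cna, losses_cna = train_da(X_cna, n_hidden=3, seed=1)
deep_features = np.hstack([encode(X_expr, W_expr, b_expr),
                           encode(X_cna, W_cna, b_cna)])
print(deep_features.shape)
print(losses_expr[0], losses_expr[-1])
```

In a workflow like the one the abstract outlines, a matrix such as `deep_features` would then feed the downstream steps (clustering, association tests, and Cox regression), which are outside the scope of this sketch.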
