
Pathway-based deep clustering for molecular subtyping of cancer.

Affiliations

Analytics and Data Science, Kennesaw State University, Kennesaw, USA.

Department of Computer Science, Kennesaw State University, Marietta, USA.

Publication information

Methods. 2020 Feb 15;173:24-31. doi: 10.1016/j.ymeth.2019.06.017. Epub 2019 Jun 25.

Abstract

Cancer is a genetic disease comprising multiple subtypes that have distinct molecular characteristics and clinical features. Cancer subtyping helps improve personalized treatment and decision making, as different cancer subtypes respond differently to treatment. The increasing availability of cancer-related genomic data provides the opportunity to identify molecular subtypes. Several unsupervised machine learning techniques have been applied to molecular data from tumor samples to identify cancer subtypes that are genetically and clinically distinct. However, most clustering methods fail to cluster patients efficiently because of the challenges posed by high-throughput genomic data and its non-linearity. In this paper, we propose a pathway-based deep clustering method (PACL) for molecular subtyping of cancer, which incorporates gene expression data and a biological pathway database to group patients into cancer subtypes. The main contribution of our model is to discover high-level representations of biological data by learning complex hierarchical and nonlinear effects of pathways. We compared the performance of our model with a number of benchmark clustering methods recently proposed for cancer subtyping. Using logrank tests, we assessed the hypothesis that the clusters (subtypes) are associated with different survival outcomes. PACL showed the lowest logrank p-value among the compared methods. This indicates that the patient groups clustered by PACL may correspond to subtypes that are significantly associated with distinct survival distributions. Moreover, PACL provides a solution to comprehensively identify subtypes and to interpret the model at the biological pathway level. The open-source PyTorch implementation of PACL is publicly available at https://github.com/tmallava/PACL.
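
The evaluation described in the abstract compares survival between clusters with a logrank test. The following is a minimal sketch of that idea for the two-group case, written in pure Python. It is not taken from the PACL repository: the function name `logrank_two_groups`, the input layout, and the toy data are assumptions made for illustration, and a comparison across more than two clusters would need the multi-group form of the test.

```python
import math


def logrank_two_groups(times, events, labels):
    """Two-group logrank test (illustrative helper, not part of PACL).

    times  : follow-up time per patient
    events : 1 if the event (death) was observed, 0 if censored
    labels : cluster assignment per patient, 0 or 1
    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    observed_1 = 0.0  # observed events in cluster 1
    expected_1 = 0.0  # expected events in cluster 1 under the null hypothesis
    variance = 0.0    # hypergeometric variance accumulated over event times

    rows = list(zip(times, events, labels))
    event_times = sorted({t for t, e, _ in rows if e == 1})

    for t in event_times:
        # Patients still under observation just before time t.
        n = sum(1 for ti, _, _ in rows if ti >= t)
        n1 = sum(1 for ti, _, g in rows if ti >= t and g == 1)
        # Events occurring exactly at time t, overall and in cluster 1.
        d = sum(1 for ti, e, _ in rows if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in rows if ti == t and e == 1 and g == 1)

        observed_1 += d1
        expected_1 += d * n1 / n
        if n > 1:
            frac = n1 / n
            variance += d * frac * (1 - frac) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0

    chi2 = (observed_1 - expected_1) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data (invented for illustration): ten patients, all events observed.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1] * 10
separated = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # clusters split by survival time
mixed = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]      # clusters unrelated to survival

chi2_sep, p_sep = logrank_two_groups(times, events, separated)
chi2_mix, p_mix = logrank_two_groups(times, events, mixed)
print(f"separated: chi2={chi2_sep:.2f}, p={p_sep:.4f}")
print(f"mixed:     chi2={chi2_mix:.2f}, p={p_mix:.4f}")
```

A smaller p-value suggests that the cluster assignment separates patients with different survival distributions, which is the sense in which the abstract reports the lowest p-value for PACL.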


