A robust topology-based algorithm for gene expression profiling.

Author information

Seemann Lars, Shulman Jason, Gunaratne Gemunu H

Affiliations

Department of Physics, University of Houston, Houston, TX 77204, USA.

Department of Physics, Richard Stockton College of New Jersey, Pomona, NJ 08240, USA.

Publication information

ISRN Bioinform. 2012 Nov 11;2012:381023. doi: 10.5402/2012/381023. eCollection 2012.

Abstract

Early and accurate diagnoses of cancer can significantly improve the design of personalized therapy and enhance the success of therapeutic interventions. Histopathological approaches, which rely on microscopic examinations of malignant tissue, are not conducive to timely diagnoses. High throughput genomics offers a possible new classification of cancer subtypes. Unfortunately, most clustering algorithms have not been proven sufficiently robust. We propose a novel approach that relies on the use of statistical invariants and persistent homology, one of the most exciting recent developments in topology. It identifies a sufficient but compact set of genes for the analysis as well as a core group of tightly correlated patient samples for each subtype. Partitioning occurs hierarchically and allows for the identification of genetically similar subtypes. We analyzed the gene expression profiles of 202 tumors of the brain cancer glioblastoma multiforme (GBM) given at the Cancer Genome Atlas (TCGA) site. We identify core patient groups associated with the classical, mesenchymal, and proneural subtypes of GBM. In our analysis, the neural subtype consists of several small groups rather than a single component. A subtype prediction model is introduced which partitions tumors in a manner consistent with clustering algorithms but requires the genetic signature of only 59 genes.
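The abstract does not spell out the algorithm, so the following is only a minimal sketch of one standard building block it mentions, not the authors' implementation. It assumes samples are compared by the distance 1 - Pearson correlation and computes 0-dimensional persistent homology, which tracks how connected components of samples merge as the distance scale grows (equivalent to single-linkage clustering). The function names `h0_persistence` and `components`, the distance choice, and the toy expression profiles are all hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def h0_persistence(profiles):
    """Return one (scale, i, j) triple per merge of two components.

    Every sample starts as its own component at scale 0; a component
    "dies" at the scale where an edge first joins it to another one.
    Assumed distance (not from the paper): 1 - Pearson correlation.
    """
    n = len(profiles)
    edges = sorted(
        (1.0 - pearson(profiles[i], profiles[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    parent = list(range(n))
    merges = []
    for dist, i, j in edges:
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:  # this edge joins two distinct components
            parent[rj] = ri
            merges.append((dist, i, j))
    return merges


def components(n, merges, threshold):
    """Groups of samples connected by merge edges with scale <= threshold."""
    parent = list(range(n))
    for dist, i, j in merges:
        if dist <= threshold:
            parent[find(parent, j)] = find(parent, i)
    groups = {}
    for s in range(n):
        groups.setdefault(find(parent, s), []).append(s)
    return sorted(groups.values())


# Toy data (invented): rows are samples, columns are genes.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [4.2, 2.9, 2.1, 1.1],
]
merges = h0_persistence(profiles)
groups = components(len(profiles), merges, threshold=0.5)
print(merges)
print(groups)
```

In this sketch, a large gap between consecutive merge scales marks a long-lived (persistent) grouping, which is the sense in which a tightly correlated core of samples can be separated from the rest. How the paper selects genes, uses statistical invariants, or applies higher-dimensional homology is not shown here.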

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d754/4393071/ca3009403d81/ISRN.BIOINFORMATICS2012-381023.001.jpg
