An Integrated Approach for Identifying Molecular Subtypes in Human Colon Cancer Using Gene Expression Data.

Authors

Wang Wen-Hui, Xie Ting-Yan, Xie Guang-Lei, Ren Zhong-Lu, Li Jin-Ming

Affiliations

State Key Laboratory of Organ Failure Research, Division of Nephrology, Southern Medical University, Guangzhou 510515, China.

Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China.

Publication information

Genes (Basel). 2018 Aug 2;9(8):397. doi: 10.3390/genes9080397.

Abstract

Identifying molecular subtypes of colorectal cancer (CRC) may allow for more rational, patient-specific treatment. Various studies have identified molecular subtypes of CRC using gene expression data, but their results are inconsistent, and further research is necessary. From a methodological point of view, a progressive approach is needed to identify molecular subtypes in human colon cancer using gene expression data. We propose an approach that integrates denoising by the Bayesian robust principal component analysis (BRPCA) algorithm, hierarchical clustering by the directed bubble hierarchical tree (DBHT) algorithm, and feature gene selection by an improved differential evolution based feature selection (DEFS) algorithm. In this approach, the criterion for a reasonable clustering is that the normal samples are completely and exclusively clustered into one class, and the feature selection accounts for imbalances in sample numbers among subtypes. With this approach, we identified molecular subtypes of colon cancer in the mRNA gene expression dataset of 153 colon cancer samples and 19 normal control samples from The Cancer Genome Atlas (TCGA) project. The colon cancer samples were clustered into 7 subtypes with 44 feature genes. Our approach identified finer subtypes of colon cancer with fewer feature genes than two other recent studies, and it offers a generic methodology that might be applied to identify subtypes of other cancers.
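The clustering criterion described in the abstract (normal samples completely and exclusively clustered into one class) can be illustrated with a minimal sketch. The function name, the argument layout, and the toy labels below are assumptions made for illustration; they are not taken from the paper, and the sketch does not implement BRPCA, DBHT, or DEFS.

```python
def normals_form_exclusive_cluster(labels, is_normal):
    """Check the 'complete and exclusive' criterion on one clustering.

    labels    : cluster label of each sample (hypothetical input format)
    is_normal : True for normal control samples, False for tumor samples

    Returns True only if all normal samples share a single cluster label
    (complete) and no tumor sample carries that label (exclusive).
    """
    if len(labels) != len(is_normal):
        raise ValueError("labels and is_normal must have the same length")

    # Collect the distinct cluster labels assigned to normal samples.
    normal_labels = {lab for lab, normal in zip(labels, is_normal) if normal}

    # Complete: the normals must occupy exactly one cluster.
    if len(normal_labels) != 1:
        return False
    (normal_label,) = normal_labels

    # Exclusive: every sample in that cluster must be a normal sample.
    return all(normal for lab, normal in zip(labels, is_normal)
               if lab == normal_label)


# Toy example: two normals in cluster 0, tumors in clusters 1 and 2.
toy_labels = [0, 0, 1, 1, 2]
toy_normal = [True, True, False, False, False]
print(normals_form_exclusive_cluster(toy_labels, toy_normal))
```

Under this reading, a candidate clustering would be rejected if the normal samples were split across clusters or if any tumor sample fell into the normal cluster. How the authors apply the criterion when choosing among candidate clusterings is not stated in the abstract.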

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba58/6115727/5a77fdd9938b/genes-09-00397-g001.jpg
