Alcaraz Nicolas, List Markus, Batra Richa, Vandin Fabio, Ditzel Henrik J, Baumbach Jan
Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark.
Department of Cancer and Inflammation Research, Institute of Molecular Medicine, University of Southern Denmark, 5000 Odense, Denmark.
Nucleic Acids Res. 2017 Sep 19;45(16):e151. doi: 10.1093/nar/gkx642.
Gene expression profiles have been extensively discussed as an aid to guide therapy by predicting disease outcome for patients suffering from complex diseases such as cancer. However, prediction models built upon single-gene (SG) features show poor stability and poor performance on independent datasets. Attempts to mitigate these drawbacks have led to the development of network-based approaches that integrate pathway information to produce meta-gene (MG) features. So far, however, MG approaches have dealt only with the two-class problem of good versus poor outcome prediction. Stratifying patients by molecular subtype can provide a more detailed view of the disease and lead to more personalized therapies. We propose and discuss a novel MG approach based on de novo pathways, which are used here for the first time as features in a multi-class setting to predict cancer subtypes. Comprehensive evaluation in a large cohort of breast cancer samples from The Cancer Genome Atlas (TCGA) revealed that MG models are considerably more stable than SG models, while also providing valuable insight into the cancer hallmarks that drive the subtypes. In addition, when tested on an independent non-TCGA benchmark dataset, MG features consistently outperformed SG models. We provide an easy-to-use web service at http://pathclass.compbio.sdu.dk where users can upload their own gene expression datasets from breast cancer studies and obtain subtype predictions from all the classifiers.
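To make the idea of meta-gene features in a multi-class setting concrete, the following is a minimal sketch and not the method of the paper. The abstract does not state how genes within a pathway are aggregated or which classifiers are used, so averaging z-scored expression per module and a nearest-centroid classifier are assumptions chosen for brevity. The gene names, module names, subtype labels, and expression values are all hypothetical toy data.

```python
import numpy as np

def metagene_features(expr, gene_names, modules):
    """Collapse a samples-x-genes matrix into one feature per gene module.

    Assumption: a meta-gene is the mean of z-scored expression over the
    genes of a module. The abstract does not specify the aggregation.
    """
    index = {g: i for i, g in enumerate(gene_names)}
    # z-score each gene across samples so genes share a common scale
    mu = expr.mean(axis=0)
    sd = expr.std(axis=0)
    sd[sd == 0] = 1.0  # avoid division by zero for constant genes
    z = (expr - mu) / sd
    names = sorted(modules)
    cols = []
    for name in names:
        idx = [index[g] for g in modules[name] if g in index]
        if not idx:
            raise ValueError(f"module {name!r} has no genes in the matrix")
        cols.append(z[:, idx].mean(axis=1))
    return np.column_stack(cols), names

def fit_centroids(X, y):
    """Compute one centroid per class (multi-class nearest centroid)."""
    labels = sorted(set(y))
    y = np.asarray(y)
    centroids = np.vstack([X[y == c].mean(axis=0) for c in labels])
    return labels, centroids

def predict(X, labels, centroids):
    """Assign each sample to the class with the closest centroid."""
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return [labels[i] for i in dist.argmin(axis=1)]

# Hypothetical toy data: 6 samples, 4 genes, 2 modules, 3 subtypes.
gene_names = ["G1", "G2", "G3", "G4"]
modules = {"module_a": ["G1", "G2"], "module_b": ["G3", "G4"]}
expr = np.array([
    [9.0, 9.0, 1.0, 1.0],
    [8.0, 9.0, 1.0, 2.0],
    [1.0, 1.0, 9.0, 9.0],
    [2.0, 1.0, 9.0, 8.0],
    [5.0, 5.0, 5.0, 5.0],
    [5.0, 4.0, 5.0, 6.0],
])
subtypes = ["A", "A", "B", "B", "C", "C"]

features, feature_names = metagene_features(expr, gene_names, modules)
labels, centroids = fit_centroids(features, subtypes)
predicted = predict(features, labels, centroids)
print(feature_names, predicted)
```

In this sketch the classifier sees two module-level features instead of four gene-level ones. The intuition behind the stability claim is that an aggregated feature depends less on any single gene, although the toy data above demonstrate only the mechanics, not that property.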