Jiang Limin, Xiao Yongkang, Ding Yijie, Tang Jijun, Guo Fei
School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China.
School of Chemical Engineering and Technology, Tianjin University, Tianjin, China.
Front Genet. 2019 Feb 5;10:20. doi: 10.3389/fgene.2019.00020. eCollection 2019.
Discovering cancer subtypes is useful for guiding the clinical treatment of multiple cancers. Advances in tissue profiling technologies have accumulated diverse types of data. Based on these types of expression data, various computational methods have been proposed to predict cancer subtypes. It is crucial to study how to better integrate these multiple profiles of data. In this paper, we collect multiple profiles of data for five cancers from The Cancer Genome Atlas (TCGA). Then, we construct three similarity kernels for all patients of the same cancer from gene expression, miRNA expression and isoform expression data. We also propose a novel unsupervised multiple kernel fusion method, Similarity Kernel Fusion (SKF), to integrate the three similarity kernels into one combined kernel. Finally, we apply spectral clustering to the integrated kernel to predict cancer subtypes. In the experimental results, the p-values from the Cox regression model and survival curve analysis are used to evaluate the performance of the predicted subtypes on three datasets. Our kernel fusion method, SKF, has outstanding performance compared with single kernels and other multiple kernel fusion strategies. This demonstrates that our method can identify subtypes more accurately across various kinds of cancer. Our cancer subtype prediction method can identify essential genes and biomarkers for disease diagnosis and prognosis, and we also discuss the possible side effects of therapies and treatment.
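The pipeline described above (one similarity kernel per data type, fusion into a single kernel, spectral clustering on the result) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the SKF method itself: the abstract does not give the SKF fusion rule, so an unweighted average of the kernels stands in for it. The Gaussian kernel, the median bandwidth heuristic, the synthetic data, the restriction to two subtypes, and all function names are likewise assumptions made for the example.

```python
import numpy as np


def gaussian_kernel(X):
    """Patient-by-patient Gaussian similarity from a (patients x features) matrix.

    Assumption: the bandwidth is set by the median squared distance between
    distinct patients; the abstract does not specify the kernel that was used.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    off_diag = ~np.eye(len(X), dtype=bool)
    sigma2 = np.median(d2[off_diag])
    return np.exp(-d2 / (2.0 * sigma2))


def fuse_by_average(kernels):
    """Stand-in for SKF (assumption): unweighted mean of the similarity kernels."""
    return np.mean(kernels, axis=0)


def spectral_bipartition(K):
    """Two-way spectral clustering on a precomputed similarity kernel.

    Splits patients by the sign of the eigenvector belonging to the second
    smallest eigenvalue of the symmetric normalized graph Laplacian.
    """
    degree = K.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.eye(len(K)) - d_inv_sqrt[:, None] * K * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)


# Synthetic stand-ins for gene, miRNA and isoform expression (hypothetical data):
# 20 patients in two groups, three views with different numbers of features.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 10)
views = [
    rng.normal(loc=truth[:, None] * 3.0, scale=1.0, size=(20, n_features))
    for n_features in (50, 30, 40)
]

kernels = [gaussian_kernel(X) for X in views]
fused = fuse_by_average(kernels)
labels = spectral_bipartition(fused)

# Cluster labels are arbitrary, so agreement is measured up to a label swap.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"agreement with simulated groups: {agreement:.2f}")
```

For more than two subtypes, the usual extension is to keep the first k eigenvectors and run k-means on their rows. The predicted groups would then be assessed with survival analysis, as the abstract describes.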