Biomedical Informatics Research Lab, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, 211198, Nanjing, China.
Institute of Innovative Drug Discovery and Development, China Pharmaceutical University, 211198, Nanjing, China.
Funct Integr Genomics. 2024 Nov 28;24(6):224. doi: 10.1007/s10142-024-01501-0.
Pathway-based clustering methods have been proposed to explore tumor heterogeneity. However, a current limitation of such methods is that the pathways of interest must be specified explicitly in advance. We developed the PathClustNet algorithm, a pathway-based clustering method designed to identify cancer subtypes. The method first detects gene clusters and identifies the overrepresented pathways associated with them. It then reveals cancer subtypes by clustering samples on their pathway enrichment scores. We applied the method to TCGA pan-cancer data and identified four pan-cancer subtypes, termed C1, C2, C3, and C4. C1 exhibited high metabolic activity, favorable survival, and the lowest TP53 mutation rate. C2 showed high immune, developmental, and stromal pathway activity, along with the lowest tumor purity and intratumor heterogeneity. C3, which overexpressed cell cycle and DNA repair pathways, was the most genomically unstable and had the highest TP53 mutation rate. C4 was enriched in neuronal pathways and had the lowest response rate to chemotherapy but the highest tumor purity and genomic stability. Furthermore, age correlated positively with most pathways but negatively with neuronal pathways. Smoking, viral infection, and alcohol use affected the activities of neuronal, cell cycle, immune, stromal, developmental, and metabolic pathways to varying degrees. The PathClustNet algorithm unveils a novel pan-cancer classification based on metabolic, immune, stromal, developmental, cell cycle, and neuronal pathways. These subtypes display distinct molecular and clinical features that warrant further investigation in precision oncology.
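The final step described above, clustering samples on pathway-level scores, can be illustrated with a minimal sketch. It is not the PathClustNet implementation: the abstract does not state how gene clusters are detected, how enrichment scores are computed, or which clustering method is used. The sketch assumes the pathway gene sets are already known, uses the mean per-gene z-score as a stand-in enrichment score, and uses plain k-means as a stand-in clustering step. All gene names, pathway names, and function names are hypothetical.

```python
from statistics import mean, pstdev


def zscore_genes(expr):
    """Standardise each gene across samples. expr maps gene -> list of values."""
    out = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return out


def pathway_scores(expr, pathways):
    """Score every sample on every pathway as the mean z-score of member genes.

    This is a stand-in for a real enrichment score (assumption, not from the source).
    Returns (pathway names, one score vector per sample).
    """
    z = zscore_genes(expr)
    n_samples = len(next(iter(z.values())))
    names = sorted(pathways)
    scores = []
    for s in range(n_samples):
        row = []
        for name in names:
            genes = [g for g in pathways[name] if g in z]
            row.append(mean(z[g][s] for g in genes) if genes else 0.0)
        scores.append(row)
    return names, scores


def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Start from the first point, then repeatedly add the point farthest
    # from all centres chosen so far.
    centers = [list(points[0])]
    while len(centers) < k:
        centers.append(list(max(points, key=lambda p: min(dist2(p, c) for c in centers))))

    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centers[j] = [mean(col) for col in zip(*members)]
    return labels


# Toy data (hypothetical): four samples, two "metabolic" and two "neuronal" genes.
expr = {
    "M1": [9, 8, 1, 2],
    "M2": [8, 9, 2, 1],
    "N1": [1, 2, 9, 8],
    "N2": [2, 1, 8, 9],
}
pathways = {"metabolic": ["M1", "M2"], "neuronal": ["N1", "N2"]}

names, scores = pathway_scores(expr, pathways)
labels = kmeans(scores, k=2)
print(names, labels)
```

On this toy input the first two samples, which score high on the hypothetical metabolic pathway, fall into one cluster and the last two, which score high on the neuronal pathway, fall into the other. Clustering on pathway scores rather than on individual genes keeps the feature space small and ties each cluster to a biological process.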