Department of Applied Research and Technological Development, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano, Milan, Italy.
Unit of Biostatistics, Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy.
Cancer Med. 2023 Apr;12(8):10156-10168. doi: 10.1002/cam4.5719. Epub 2023 Mar 20.
Cholangiocarcinoma (CC) is a rare and aggressive disease with limited therapeutic options and a poor prognosis. All publicly available cohorts reporting transcriptomic data on intrahepatic cholangiocarcinoma (ICC) and extrahepatic cholangiocarcinoma (ECC) were collected with the aim of providing a comprehensive, clinically relevant gene expression-based classification.
A total of 543 patients from seven public datasets, whose primary tumor tissues had been profiled on RNAseq or microarray platforms, served as the discovery set for identifying distinct biological subgroups. Leveraging machine learning techniques, group predictors developed on the discovery sets were applied to a single cohort of 131 patients profiled with RNAseq, for validation and assessment of clinical relevance.
Unsupervised clustering of gene expression data identified four subgroups in both the ICC and the ECC discovery datasets, each characterized by a distinct type of immune infiltrate and distinct signaling pathways. We next developed class predictors using short gene-list signatures and, in an independent dataset, identified subgroups of ICC tumors with different prognoses.
The class predictor developed here allows identification, at the single-sample level, of CC subgroups with specific biological features and clinical behavior. These results represent a starting point for a complete molecular characterization of CC, including the integration of genomic data, with a view to development for clinical practice.
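The workflow summarized in this abstract (unsupervised clustering on a discovery set, a short gene-list signature, then subgroup assignment for one sample at a time) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the type of predictor, so the plain k-means and nearest-centroid rule below are stand-ins chosen for illustration, not the authors' method. The toy data, the six-gene layout, and all function names (`kmeans`, `select_signature`, `predict`, `toy_sample`) are hypothetical.

```python
from statistics import mean, pvariance


def sq_dist(a, b):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(samples, k, n_iter=20):
    """Plain Lloyd's k-means with a deterministic init (evenly spaced samples)."""
    step = len(samples) // k
    centroids = [list(samples[i * step]) for i in range(k)]
    labels = [0] * len(samples)
    for _ in range(n_iter):
        # Assign each sample to its closest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(s, centroids[c]))
                  for s in samples]
        # Recompute each centroid as the per-gene mean of its members.
        for c in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels, centroids


def select_signature(centroids, n_genes):
    """Keep the genes whose centroid values vary most across subgroups."""
    spread = [pvariance(col) for col in zip(*centroids)]
    ranked = sorted(range(len(spread)), key=lambda g: spread[g], reverse=True)
    return ranked[:n_genes]


def predict(sample, centroids, signature):
    """Assign one sample to the nearest centroid, using signature genes only."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((sample[g] - centroids[c][g]) ** 2 for g in signature),
    )


def toy_sample(group, jitter):
    """Hypothetical 6-gene profile: gene `group` is high, the rest are low."""
    values = [1.0 + jitter] * 6
    values[group] = 5.0 + jitter
    return values


# Toy "discovery set": 4 subgroups x 3 samples (made-up numbers, not study data).
discovery = [toy_sample(g, j) for g in range(4) for j in (-0.2, 0.0, 0.2)]

labels, centroids = kmeans(discovery, k=4)
signature = select_signature(centroids, n_genes=4)

# Single-sample classification of a new, unseen toy profile.
predicted = predict(toy_sample(2, 0.1), centroids, signature)
```

Because `predict` needs only the stored centroids and the signature gene indices, a new sample can be classified on its own, without re-clustering the cohort. That is the property the abstract refers to as working at the single-sample level.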