Department of Electronic Engineering, City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong.
BMC Cancer. 2018 May 29;18(1):603. doi: 10.1186/s12885-018-4546-8.
Pancreatic ductal adenocarcinoma (PDAC) is the fourth leading cause of cancer-related death in the world, with a five-year survival rate of less than 5%. Not all PDAC tumors are alike: molecular heterogeneity among tumors poses a great challenge to personalized treatment of PDAC.
To dissect the molecular heterogeneity of PDAC, we performed a retrospective meta-analysis of whole-transcriptome data from more than 1200 PDAC patients. Subtypes were identified using a non-negative matrix factorization (NMF) biclustering method. We used gene set enrichment analysis (GSEA) and survival analysis to characterize the identified subtypes molecularly and clinically, respectively.
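As an illustration of the NMF-based subtyping idea only, the following minimal sketch factors a toy non-negative expression matrix and assigns each sample to its dominant factor. The abstract does not specify the NMF variant, rank selection, or preprocessing, so everything here is an assumption: the Lee-Seung multiplicative updates, the function names `nmf` and `assign_subtypes`, and the toy matrix `V` are hypothetical and are not the authors' pipeline.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (genes x samples) as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    This is an assumed, generic NMF variant, not the one used in the study.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = V.shape
    W = rng.random((n_genes, k)) + eps   # gene loadings per factor
    H = rng.random((k, n_samples)) + eps  # factor weights per sample
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Rescale so each column of W has unit norm; this makes the rows of H
    # comparable to each other without changing the product W @ H.
    norms = np.linalg.norm(W, axis=0)
    return W / norms, H * norms[:, None]

def assign_subtypes(H):
    """Label each sample (column of H) by its largest factor weight."""
    return H.argmax(axis=0)

# Toy data (hypothetical): genes 0-2 are high in samples 0-2,
# genes 3-5 are high in samples 3-5.
V = np.full((6, 6), 0.1)
V[:3, :3] = 5.0
V[3:, 3:] = 5.0

W, H = nmf(V, k=2)
labels = assign_subtypes(H)
print(labels)
```

On this toy matrix the first three samples receive one label and the last three the other; which group gets label 0 depends on the random initialization. Real analyses typically also choose the number of factors by a stability criterion across repeated runs, which this sketch omits.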
Six molecularly and clinically distinct subtypes of PDAC, L1-L6, were identified and grouped into tumor-specific (L1, L2 and L6) and stroma-specific (L3, L4 and L5) subtypes. Among the tumor-specific subtypes, L1 (~22%) is enriched for carbohydrate metabolism-related gene sets and has intermediate survival. L2 (~22%) has the worst clinical outcomes and is enriched for cell proliferation-related gene sets. About 23% of patients can be classified into L6, which has intermediate survival and is enriched for lipid and protein metabolism-related gene sets. Stroma-specific subtypes may contain high non-epithelial content, such as collagen, immune cells and islet cells, respectively. For instance, L3 (~12%) has poor survival and is enriched for collagen-associated gene sets. L4 (~14%) is enriched for various immune-related gene sets and has relatively good survival. L5 (~7%) has good clinical outcomes and is enriched for neurotransmitter and insulin secretion-related gene sets. We also identified 160 subtype-specific markers and built a deep learning-based classifier for PDAC. Applying our classification system to validation datasets, we observed highly similar molecular and clinical characteristics of the subtypes.
Our study analyzed the largest cohort of PDAC gene expression profiles investigated so far, which greatly increased statistical power and provided more robust results. We identified six molecularly and clinically distinct subtypes that give a more complete picture of PDAC heterogeneity. The 160 subtype-specific markers and the deep learning-based classification system may be used to better stratify PDAC patients for personalized treatment.