Liu Yusha, Carbonetto Peter, Willwerscheid Jason, Oakes Scott A, Macleod Kay F, Stephens Matthew
Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
Department of Human Genetics, University of Chicago, Chicago, IL, USA.
Nat Genet. 2025 Jan;57(1):263-273. doi: 10.1038/s41588-024-01997-z. Epub 2025 Jan 2.
Profiling tumors with single-cell RNA sequencing has the potential to identify recurrent patterns of transcription variation related to cancer progression, and to produce therapeutically relevant insights. However, strong intertumor heterogeneity can obscure more subtle patterns that are shared across tumors. Here we introduce a statistical method, generalized binary covariance decomposition (GBCD), to address this problem. We show that GBCD can decompose transcriptional heterogeneity into interpretable components (including patient-specific, dataset-specific and shared components relevant to disease subtypes) and that, in the presence of strong intertumor heterogeneity, it can produce more interpretable results than existing methods. Applied to data on pancreatic ductal adenocarcinoma, GBCD produced a refined characterization of existing tumor subtypes, and identified a gene expression program prognostic of poor survival independent of tumor stage and subtype. The gene expression program is enriched for genes involved in stress responses, and suggests a role for the integrated stress response in pancreatic ductal adenocarcinoma.
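The idea of decomposing heterogeneity into patient-specific and shared components can be illustrated with a toy simulation. The sketch below is not the GBCD fitting procedure and does not use the authors' software. It assumes, for illustration only, a model of the form Y ≈ L Fᵀ, where L is a binary cell-by-component membership matrix and the columns of F (gene-level effects) are roughly orthogonal, and it checks that the cell-by-cell covariance then has the structure L Lᵀ. All names, sizes and noise levels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: 3 patients, 40 cells each, 5000 genes.
n_patients, cells_per_patient, n_genes = 3, 40, 5000
n_cells = n_patients * cells_per_patient
patient = np.repeat(np.arange(n_patients), cells_per_patient)

# Binary membership matrix L (cells x components):
# one patient-specific column per patient ...
L_patient = (patient[:, None] == np.arange(n_patients)[None, :]).astype(float)
# ... plus one shared program, active in half of the cells of every patient.
L_shared = (np.arange(n_cells) % 2 == 0).astype(float)[:, None]
L = np.hstack([L_patient, L_shared])

# Gene-level effects F (genes x components) with independent unit-variance
# entries, so that F.T @ F / n_genes is close to the identity (an assumption
# of this sketch).
F = rng.normal(size=(n_genes, L.shape[1]))

# Simulated expression: low-rank signal plus a small amount of noise.
Y = L @ F.T + 0.1 * rng.normal(size=(n_cells, n_genes))

# Under these assumptions the scaled cell-by-cell covariance is close to L @ L.T.
cov = Y @ Y.T / n_genes
target = L @ L.T

# Compare off-diagonal entries only (the diagonal also carries noise variance).
off_diag = ~np.eye(n_cells, dtype=bool)
rel_err = np.linalg.norm((cov - target)[off_diag]) / np.linalg.norm(target[off_diag])
print(f"relative error of covariance vs. L L^T: {rel_err:.3f}")
```

In this toy setting the patient-specific columns produce block structure in the covariance, while the shared column adds signal that cuts across patients. A method that recovers L from the covariance can therefore, in principle, separate the two kinds of component.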