Benard Brooks A, Lalgudi Chinmay K, Ilerten Ilayda, Wang Ruohan, Gentles Andrew J
Department of Pathology, Stanford University, Stanford, CA, 94035, USA.
Department of Biochemistry, Stanford University, Stanford, CA, 94035, USA.
bioRxiv. 2025 Aug 28:2025.08.22.671849. doi: 10.1101/2025.08.22.671849.
Gene expression can be used to define prognostic and predictive biomarkers across cancers and treatment modalities. PRECOG (https://precog.stanford.edu) is a compendium of datasets with gene expression and clinical outcomes that facilitates visualization of associations between genomic profiles and patient survival. Here, we augment the existing PRECOG with new datasets for previously poorly represented adult cancer types, and add annotated pediatric and immunotherapy-treated cohorts. Pediatric PRECOG comprises ~4,000 patients across 12 cancers, while the immunotherapy cohort (ICI PRECOG) contains ~4,500 patients across 20 cancer subtypes, drawn from 80 distinct datasets in 52 studies. Across all datasets, we compute and visualize associations of gene expression with patient outcomes, using Cox regression for time-to-event data or logistic regression for responder versus non-responder status. We also estimate cell type fractions in samples via computational deconvolution with CIBERSORTx, to identify survival associations at the level of cell types. All expression data, clinical annotations, and gene and cell type survival z-scores and meta z-scores for adult, pediatric, and ICI PRECOG are available for interactive analysis and download, along with Kaplan-Meier and boxplot visualizations. This updated resource will provide new insights into biomarkers for specific therapies, populations, and cancer types.
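The abstract does not state how per-dataset z-scores are combined into meta z-scores. As a minimal sketch, assuming a Stouffer-style combination with optional square-root-of-sample-size weights (an assumption, not confirmed by the text above), the calculation for one gene might look like the following. The function names `meta_z` and `two_sided_p` are hypothetical and are not part of any PRECOG interface.

```python
import math


def meta_z(z_scores, sample_sizes=None):
    """Combine per-dataset z-scores for one gene with Stouffer's method.

    ASSUMPTION: weights are sqrt(n) per dataset when sample sizes are given,
    and equal otherwise. The abstract does not specify the weighting scheme.
    """
    if not z_scores:
        raise ValueError("at least one z-score is required")
    if sample_sizes is None:
        weights = [1.0] * len(z_scores)
    else:
        if len(sample_sizes) != len(z_scores):
            raise ValueError("z_scores and sample_sizes differ in length")
        weights = [math.sqrt(n) for n in sample_sizes]
    # Weighted sum of z-scores, rescaled so the result is again ~N(0, 1)
    # under the null hypothesis of no association in any dataset.
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Illustrative values only, not taken from PRECOG: one gene's survival
# z-scores in three hypothetical datasets with different cohort sizes.
combined = meta_z([2.1, 1.4, -0.3], sample_sizes=[250, 120, 60])
print(f"meta z = {combined:.3f}, p = {two_sided_p(combined):.4f}")
```

Under this convention the sign of the combined score follows the sign of the per-dataset scores, so consistent effects across datasets reinforce one another while opposing effects cancel.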