Spirina Menand Elena, De Vries-Brilland Manon, Tessier Leslie, Dauvé Jonathan, Campone Mario, Verrièle Véronique, Jrad Nisrine, Marion Jean-Marie, Chauvet Pierre, Passot Christophe, Morel Alain
Laboratoire Angevin de Recherche en Ingénierie des Systèmes (EA7315), Université d'Angers, 49035 Angers, France.
Unité de Génomique Fonctionnelle, Institut de Cancérologie de l'Ouest Nantes-Angers, 49055 Angers, France.
Biomedicines. 2024 Dec 18;12(12):2881. doi: 10.3390/biomedicines12122881.
Ovarian cancer is a complex disease with poor outcomes that affects women worldwide. The lack of successful therapeutic options for this malignancy has led to the need to identify novel biomarkers for patient stratification. Here, we aim to develop outcome predictors based on gene expression data, as they may serve to identify categories of patients who are more likely to respond to certain therapies. We used The Cancer Genome Atlas (TCGA) ovarian cancer transcriptomic data from 372 patients and approximately 16,600 genes to train and evaluate deep learning survival models. In addition, we collected an in-house validation dataset of 12 patients to assess the performance of the trained survival models for their direct use in clinical practice. Despite disappointing generalization capabilities, we demonstrated how our model can be interpreted to uncover biological processes associated with survival. We calculated the contributions of the input genes to the output of the best trained model and derived the corresponding molecular pathways. These pathways allowed us to stratify the TCGA patients into high-risk and low-risk groups (p-value < 0.025). We validated the stratification ability of the identified pathways on the in-house dataset consisting of 12 patients (p-value = 0.229) and on an external clinical and molecular dataset consisting of 274 patients (p-value = 0.006). Deep learning-based models for survival prediction with RNA-seq data could be used to detect and interpret the gene sets associated with survival in ovarian cancer patients and open a new avenue for future research.
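To make the stratification step more concrete, the following is a minimal sketch of how patients might be split into high-risk and low-risk groups and compared. It is not the authors' pipeline: the abstract does not state the cutoff rule or the statistical test, so the median split, the two-group log-rank test, the function names (`stratify_by_median`, `logrank_p_value`) and the toy data are all assumptions made for illustration.

```python
import math
from statistics import median


def stratify_by_median(scores):
    """Label each patient 1 (high risk) if the score exceeds the median, else 0.

    Assumption: a median cutoff; the abstract does not specify the rule used.
    """
    cutoff = median(scores)
    return [1 if s > cutoff else 0 for s in scores]


def logrank_p_value(times, events, groups):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    groups : 0 or 1 for each patient
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed = expected = variance = 0.0
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = [(tt, e, g) for tt, e, g in zip(times, events, groups) if tt >= t]
        n = len(at_risk)
        n1 = sum(1 for _, _, g in at_risk if g == 1)
        d = sum(1 for tt, e, _ in at_risk if e and tt == t)
        d1 = sum(1 for tt, e, g in at_risk if e and tt == t and g == 1)
        observed += d1
        expected += d * n1 / n
        if n > 1:  # hypergeometric variance of d1 given n, n1, d
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    stat = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return stat, math.erfc(math.sqrt(stat / 2))


# Hypothetical toy data: risk scores, survival times (months), event flags.
scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
times = [5, 8, 12, 40, 55, 60]
events = [1, 1, 1, 1, 0, 1]

groups = stratify_by_median(scores)
stat, p = logrank_p_value(times, events, groups)
print(groups, round(stat, 3), round(p, 4))
```

With a cohort as small as the 12-patient in-house set, a test of this kind has little power, which is consistent with the non-significant p-value reported for that dataset.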