School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, China.
Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China.
Brief Bioinform. 2020 Jul 15;21(4):1411-1424. doi: 10.1093/bib/bbz078.
As awareness of cancer heterogeneity grows, more accurate prediction of prognosis is needed to support personalized treatment. Extensive efforts have recently been made to exploit variations in gene expression for this purpose. However, the prognostic gene signatures identified by most existing methods show little robustness across different datasets of the same cancer. To improve the robustness of gene signatures, we propose a novel high-frequency sub-pathway mining approach (HiFreSP) that integrates a randomization strategy with gene interaction pathways. Using HiFreSP, we identified a six-gene signature (CCND1, CSF3R, E2F2, JUP, RARA and TCF7) in esophageal squamous cell carcinoma (ESCC). This signature strongly predicted the clinical outcome of ESCC patients in two independent datasets (log-rank test, P = 0.0045 and 0.0087). To further assess the predictive performance of HiFreSP, we applied it to two other cancers: pancreatic adenocarcinoma and breast cancer. The identified signatures showed high predictive power in all testing datasets of both cancers. Furthermore, compared with two popular methods for predicting prognostic signatures, the least absolute shrinkage and selection operator (LASSO) penalized Cox proportional hazards model and the random survival forest, HiFreSP showed better predictive accuracy and generalization across all testing datasets of the three cancers. Lastly, we applied HiFreSP to 8137 patients across 20 cancer types in the TCGA database and found high-frequency prognosis-associated pathways in many cancers. Taken together, HiFreSP shows higher prognostic capability and greater robustness, and the identified signatures provide clinical guidance for cancer prognosis. HiFreSP is freely available on GitHub: https://github.com/chunquanlipathway/HiFreSP.
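The signature is evaluated above with a log-rank test comparing the survival of two patient groups. As a rough illustration of what that test computes, here is a minimal pure-Python sketch of the standard two-group log-rank statistic. It is not taken from the HiFreSP code: the function name `logrank_test`, its argument layout, and the toy data are assumptions made for this example, and how patients are split into groups (for instance by a risk score cutoff) is left out because the abstract does not specify it.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (hypothetical helper, not part of HiFreSP).

    times_*  : follow-up time per patient
    events_* : 1 if the event (e.g. death) was observed, 0 if censored
    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    # Pool both groups, tagging each sample with its group (0 = A, 1 = B).
    samples = [(t, e, 0) for t, e in zip(times_a, events_a)]
    samples += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Only time points with at least one observed event contribute.
    event_times = sorted({t for t, e, _ in samples if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk = [(ti, ei, g) for ti, ei, g in samples if ti >= t]
        n = len(at_risk)                                   # patients at risk
        n_a = sum(1 for _, _, g in at_risk if g == 0)      # at risk in group A
        d = sum(1 for ti, ei, _ in at_risk if ei and ti == t)  # events at t
        d_a = sum(1 for ti, ei, g in at_risk if ei and ti == t and g == 0)

        observed_a += d_a
        expected_a += d * n_a / n                          # hypergeometric mean
        if n > 1:                                          # hypergeometric variance
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Toy data invented for illustration only (months, event indicator).
    high_risk_times, high_risk_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
    low_risk_times, low_risk_events = [10, 11, 12, 13, 14], [1, 1, 1, 1, 1]
    chi2, p = logrank_test(high_risk_times, high_risk_events,
                           low_risk_times, low_risk_events)
    print(f"chi-square = {chi2:.3f}, P = {p:.4f}")
```

A small P value, such as those reported for the two ESCC datasets, indicates that the survival curves of the two groups differ more than would be expected by chance. For real analyses an established survival analysis library is preferable to this hand-rolled version.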