Department of Bioinformatics and Life Science, Soongsil University, Seoul, 06978, Republic of Korea.
Department of Pathology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, 05505, Republic of Korea.
BMC Cancer. 2020 Nov 2;20(1):1052. doi: 10.1186/s12885-020-07399-8.
Triple-negative breast cancer (TNBC) is an aggressive and complex subtype of breast cancer. Current biomarkers for breast cancer treatment depend largely on targeting the oestrogen receptor, progesterone receptor, or HER2, none of which is expressed in TNBC; this leads to treatment failure and disease recurrence and creates clinical challenges. TNBC treatment therefore still needs improvement, and discovering effective biomarkers that translate easily to the clinic is essential.
We report an approach for discovering biomarkers that can predict tumour relapse and pathologic complete response (pCR) in TNBC on the basis of mRNA expression quantified using the NanoString nCounter Immunology Panel. To overcome the limited sample size, prediction models based on random forest were constructed using the differentially expressed genes (DEGs) as selected features. We also evaluated the differences between pre- and post-treatment groups, aiming at a combined assessment of pCR and relapse, using additive models in edgeR.
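As an illustration of what an additive model for paired pre- and post-treatment samples can look like, the sketch below builds a design matrix with an intercept, one indicator column per subject (except a reference subject), and one treatment column. This mirrors the usual paired layout of a formula such as `~ subject + treatment`, but it is an assumption: the abstract does not state the exact design used, and the function name `paired_design`, the column names, and the `baseline` label are hypothetical.

```python
from typing import List, Sequence, Tuple


def paired_design(
    subjects: Sequence[str],
    timepoints: Sequence[str],
    baseline: str = "pre",
) -> Tuple[List[str], List[List[int]]]:
    """Build an additive design: intercept + subject indicators + treatment indicator.

    The first subject (in sorted order) is the reference level and gets no column,
    so the treatment coefficient is estimated after adjusting for baseline
    differences between subjects.
    """
    if len(subjects) != len(timepoints):
        raise ValueError("subjects and timepoints must have the same length")

    levels = sorted(set(subjects))
    # Hypothetical column names, chosen for readability only.
    columns = ["Intercept"] + [f"subject_{s}" for s in levels[1:]] + ["post_vs_pre"]

    rows = []
    for subject, timepoint in zip(subjects, timepoints):
        subject_part = [int(subject == level) for level in levels[1:]]
        treatment_part = [int(timepoint != baseline)]
        rows.append([1] + subject_part + treatment_part)
    return columns, rows
```

In this layout, testing the last coefficient asks whether expression changes from pre- to post-treatment within subjects, which is the comparison the paragraph above describes.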
We identified nine and 13 DEGs strongly associated with pCR and relapse, respectively, from 579 immune genes in a small number of samples (n = 55) using edgeR. An additive model comparing pre- and post-treatment groups, adjusted for the independent subject in the relapse group, revealed associations for 41 genes. Comprehensive analysis indicated that our prediction models were more accurate than those constructed using features extracted with the existing feature selection model Elastic Net. The robustness of the prediction models was validated using a randomization test (empirical P = 0.015 for the pCR model and empirical P = 0.018 for the relapse model). Furthermore, three DEGs (FCER1A, EDNRB, and TGFBI) in the relapse model showed prognostic significance for predicting the survival of patients with cancer in a survival analysis based on the Cox proportional hazards regression model.
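A randomization test of this kind can be sketched as follows: the class labels are shuffled many times, the model is re-scored on each shuffled labelling, and the empirical P is the fraction of shuffles that score at least as well as the real labels. This is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not give the number of permutations or the scoring scheme, and a one-dimensional nearest-centroid classifier with leave-one-out accuracy stands in here for the random forest models. The names `loo_centroid_accuracy` and `empirical_p` are hypothetical.

```python
import random
from typing import Callable, Sequence


def loo_centroid_accuracy(x: Sequence[float], y: Sequence[int]) -> float:
    """Leave-one-out accuracy of a 1-D nearest-centroid classifier (toy stand-in model)."""
    correct = 0
    for i in range(len(x)):
        # Train on every sample except the held-out one.
        train = [(xj, yj) for j, (xj, yj) in enumerate(zip(x, y)) if j != i]
        centroids = {}
        for label in {lbl for _, lbl in train}:
            values = [v for v, lbl in train if lbl == label]
            centroids[label] = sum(values) / len(values)
        # Predict the label whose centroid is closest to the held-out sample.
        predicted = min(centroids, key=lambda c: abs(x[i] - centroids[c]))
        correct += predicted == y[i]
    return correct / len(x)


def empirical_p(
    x: Sequence[float],
    y: Sequence[int],
    score_fn: Callable[[Sequence[float], Sequence[int]], float],
    n_perm: int = 999,  # assumed value; the source does not report it
    seed: int = 0,
) -> float:
    """Empirical P: share of label shuffles scoring at least as well as the real labels."""
    rng = random.Random(seed)
    observed = score_fn(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if score_fn(x, shuffled) >= observed:
            hits += 1
    # Adding 1 to both counts treats the observed labelling as one permutation,
    # so the estimate can never be exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A small empirical P indicates that the observed accuracy is unlikely to arise when the labels carry no information about the features.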
Gene expression quantified via the NanoString nCounter Immunology Panel can be seamlessly analysed using edgeR, even with small sample sizes. Our approach provides a scalable framework that can easily be applied to the discovery of biomarkers based on the NanoString nCounter Immunology Panel.
The source code will be available from GitHub at https://github.com/sungheep/nanostring.