School of Mathematics and Statistics, Hanshan Normal University, Chaozhou, China.
Institute of Paediatrics, Guangzhou Women and Children's Medical Centre, Guangzhou Medical University, Guangzhou, China.
Cancer Med. 2021 Jun;10(11):3782-3793. doi: 10.1002/cam4.3842. Epub 2021 May 13.
Relapsed acute lymphoblastic leukaemia (ALL) remains a prevalent paediatric cancer and one of the most common causes of mortality from malignancy in children. Tailoring the intensity of therapy according to early stratification is a promising strategy, but it remains a major challenge because of disease heterogeneity and the difficulty of subtyping. In this study, we subgroup B-precursor ALL patients by gene expression profile, using non-negative matrix factorization together with minimum description length, which determines the number of subgroups without supervision. Within each of the four subgroups, logistic regression and Cox regression with elastic net regularization are used to build models predicting minimal residual disease (MRD) and relapse-free survival (RFS), respectively. Measured by the area under the receiver operating characteristic curve (AUC), subgrouping improves prediction of MRD, compared with a previously published global model, in one subgroup that mostly overlaps with the TCF3-PBX1 subtype (AUC = 0.986 in the training set and 1.0 in the test set). The models predicting RFS displayed acceptable concordance in the training set and discriminated high-relapse-risk patients in three subgroups of the test set (Wilcoxon test p = 0.048, 0.036, and 0.016). The genes that contribute to the models are specific to each subgroup. The improved MRD prediction after subgrouping, and the differences between subgroups in the genes used by the prediction models, suggest that the heterogeneity of B-precursor ALL can be handled by subgrouping patients according to gene expression profile, thereby improving prediction accuracy.
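As an illustration of two quantities named in the abstract, the following minimal sketch computes an AUC from predicted scores and evaluates an elastic net penalty. It is not the authors' code: the function names `auc` and `elastic_net_penalty`, the parameters `lam` and `alpha`, and the glmnet-style form of the penalty are assumptions made here for illustration, and the toy numbers are not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case receives
    the higher score; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def elastic_net_penalty(coefs, lam, alpha):
    """Elastic net penalty in the (assumed) glmnet-style form:
    lam * (alpha * L1 norm + (1 - alpha) / 2 * squared L2 norm).
    alpha = 1 gives the lasso penalty, alpha = 0 the ridge penalty."""
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


if __name__ == "__main__":
    # Toy scores only: 1 marks a positive case, 0 a negative case.
    toy_scores = [0.9, 0.2, 0.8, 0.1]
    toy_labels = [1, 1, 0, 0]
    print(auc(toy_scores, toy_labels))
    print(elastic_net_penalty([1.0, -2.0], lam=0.5, alpha=0.5))
```

An AUC of 1.0, as reported for the test set of one subgroup, corresponds to every positive case being scored above every negative case in this pairwise formulation.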