Shyr David, Zhang Bing M, Saini Gopin, Brewer Simon C
Department of Pediatrics, Division of Pediatric Hematology/Oncology, Section of Stem Cell Transplant, Stanford University, Stanford, CA 94305, USA.
Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA.
J Clin Med. 2024 Jul 10;13(14):4021. doi: 10.3390/jcm13144021.
Background: Leukemic relapse remains the primary cause of treatment failure and death after allogeneic hematopoietic stem cell transplant (HSCT). Changes in post-transplant donor chimerism have been identified as a predictor of relapse. A better predictive model of relapse that incorporates donor chimerism could improve leukemia-free survival by allowing earlier initiation of post-transplant treatment in individual patients. We explored the use of machine learning, a suite of analytical methods focused on pattern recognition, to improve post-transplant relapse prediction.
Methods: Using a cohort of 63 pediatric patients with acute lymphocytic leukemia (ALL) and 46 patients with acute myeloid leukemia (AML) who underwent stem cell transplant at a single institution, we built predictive models of leukemic relapse from both pre-transplant and post-transplant patient variables (specifically lineage-specific chimerism) using the random forest classifier. Local Interpretable Model-Agnostic Explanations (LIME), an interpretable machine learning tool, was used to confirm our random forest classification results.
Results: In cross-validation, a random forest model with the selected hyperparameter values predicted relapse within 24 months post-HSCT with 85% accuracy, 85% sensitivity, and 89% specificity for ALL, and with 81% accuracy, 75% sensitivity, and 100% specificity for AML. The LIME tool confirmed many of the variables that the random forest classifier identified as important for relapse prediction.
Conclusions: Machine learning methods can reveal the interaction of different risk factors for post-transplant leukemic relapse, and robust predictions can be obtained even with a modest clinical dataset. The random forest classifier identified different important predictive factors for ALL and AML in our relapse models, consistent with previous knowledge, which lends increased confidence to adopting machine learning prediction in clinical management.
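
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally computed for a binary relapse classifier. It is not the authors' code. The function name `relapse_metrics`, the `BinaryMetrics` container, and the label vectors are hypothetical and chosen only for illustration, under the assumption that relapse is coded as 1 and no relapse as 0.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float


def relapse_metrics(y_true, y_pred):
    """Compute metrics for binary labels (assumed coding: 1 = relapse, 0 = no relapse)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # relapses correctly predicted
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-relapses correctly predicted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed relapses

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return BinaryMetrics(
        accuracy=(tp + tn) / len(pairs),
        sensitivity=tp / (tp + fn),  # fraction of true relapses that were detected
        specificity=tn / (tn + fp),  # fraction of non-relapses correctly ruled out
    )


# Hypothetical labels for illustration only; these are not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = relapse_metrics(y_true, y_pred)
print(metrics)
```

In a cross-validation setting, one common choice is to pool the held-out predictions from all folds and pass them to a function like this once. Whether the study pooled predictions or averaged per-fold metrics is not stated in the abstract.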