Mosquera Orgueira Adrián, Peleteiro Raíndo Andrés, Cid López Miguel, Díaz Arias José Ángel, González Pérez Marta Sonia, Antelo Rodríguez Beatriz, Alonso Vence Natalia, Bao Pérez Laura, Ferreiro Ferro Roi, Albors Ferreiro Manuel, Abuín Blanco Aitor, Fontanes Trabazo Emilia, Cerchione Claudio, Martinnelli Giovanni, Montesinos Fernández Pau, Mateo Pérez Encinas Manuel, Luis Bello López José
University Hospital of Santiago de Compostela (SERGAS), Department of Hematology, Santiago de Compostela, Spain.
Health Research Institute of Santiago de Compostela, Grupo de Investigación en Síndromes Linfoproliferativos, Santiago de Compostela, Spain.
Front Oncol. 2021 Mar 29;11:657191. doi: 10.3389/fonc.2021.657191. eCollection 2021.
Acute myeloid leukemia (AML) is a heterogeneous neoplasm characterized by cytogenetic and molecular alterations that drive patient prognosis. Currently established risk stratification guidelines show moderate predictive accuracy, and newer tools that integrate multiple molecular variables have been shown to perform better. In this report, we aimed to create a new machine learning model of AML survival using gene expression data. We used gene expression data from two publicly available cohorts to create and validate a random forest predictor of survival, which we named ST-123. The most important variables in the model were age and the expression of two genes previously associated with the biology and prognostication of myeloid neoplasms. This classifier achieved high concordance indices in the training and validation sets (0.7228 and 0.6988, respectively), and predictions were particularly accurate in patients at the highest risk of death. Additionally, ST-123 provided significant prognostic improvements in patients with high-risk mutations. Our results indicate that the survival of patients with AML can be predicted to a great extent by applying machine learning tools to transcriptomic data, and that such predictions are particularly precise among patients with high-risk mutations.
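The concordance indices reported above measure how often a model ranks pairs of patients in the correct order of survival. As an illustration only, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' implementation: the function name, argument names, and toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs ranked correctly.

    times       -- observed follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the follow-up times differ and the
        # patient with the shorter time had an observed event.
        # (Simplification: pairs with tied times are skipped.)
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk assigned to the earlier death
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration (not taken from the study cohorts).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.7228 and 0.6988 should be read.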