Biomimetics and Intelligent Systems Group, University of Oulu, Pentti Kaiteran katu 1, 90570, Oulu, Finland.
Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Utrecht, Netherlands.
BMC Med Res Methodol. 2024 May 11;24(1):112. doi: 10.1186/s12874-024-02237-y.
Orphan diseases such as T-cell prolymphocytic leukemia are difficult to study and treat because data are scarce and effective care is complex. This study examines whether machine learning can support care strategies for orphan diseases, focusing on allogeneic hematopoietic cell transplantation (allo-HCT) in T-cell prolymphocytic leukemia. Given the rarity of the disease, it evaluates how the number of variables used affects model performance. Using data from the Center for International Blood and Marrow Transplant Research, the study analyzes outcomes following allo-HCT for T-cell prolymphocytic leukemia. Several machine learning models were developed to predict the occurrence of acute graft-versus-host disease (aGvHD) and its grade after allo-HCT. Model performance was assessed with balanced accuracy, F1 score, and ROC AUC. The Linear Discriminant Analysis (LDA) classifier achieved the highest testing balanced accuracy, 0.58, in predicting aGvHD. However, its performance was limited in the multi-class classification tasks. The study affirms the potential of machine learning to improve care for orphan diseases while underscoring how limited data and disease rarity constrain model performance.
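To put the reported balanced accuracy in context, the sketch below shows how balanced accuracy and F1 score are computed for a binary outcome. It is an illustration only, not the study's code: the labels are invented toy data (1 = aGvHD, 0 = no aGvHD), and the helper names `balanced_accuracy` and `f1_score` are hypothetical plain-Python functions rather than any library's API.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; works for binary and multi-class labels."""
    recalls = []
    for c in sorted(set(y_true)):
        n_c = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / n_c)
    return sum(recalls) / len(recalls)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels, NOT study data: 6 patients with aGvHD (1), 4 without (0).
y_true = [1] * 6 + [0] * 4

# A trivial model that always predicts the majority class.
always_positive = [1] * 10

# A model that gets some patients right in each class.
mixed = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]

plain_accuracy = sum(t == p for t, p in zip(y_true, always_positive)) / len(y_true)
print("always-positive accuracy:         ", plain_accuracy)
print("always-positive balanced accuracy:", balanced_accuracy(y_true, always_positive))
print("always-positive F1:               ", f1_score(y_true, always_positive))
print("mixed balanced accuracy:          ", round(balanced_accuracy(y_true, mixed), 3))
```

Because balanced accuracy averages recall over the classes, a binary model that always predicts one class scores 0.5 regardless of class imbalance, even when its plain accuracy and F1 look better. A binary balanced accuracy should therefore be read against that 0.5 baseline.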