UKRI Centre for Doctoral Training in Artificial Intelligence for Medical Diagnosis and Care, University of Leeds, Leeds, United Kingdom.
School of Computing, University of Leeds, Leeds, United Kingdom.
JCO Clin Cancer Inform. 2024 Apr;8:e2300264. doi: 10.1200/CCI.23.00264.
Adverse effects of chemotherapy often require hospital admission or changes to treatment management. Identifying factors that contribute to unplanned hospital utilization may improve health care quality and patients' well-being. This study aimed to assess whether patient-reported outcome measures (PROMs) improve the performance of machine learning (ML) models predicting hospital admissions, triage events (contacting a helpline or attending hospital), and changes to chemotherapy.
Clinical trial data were used, containing responses to three PROMs (European Organisation for Research and Treatment of Cancer Core Quality of Life Questionnaire [QLQ-C30], EuroQol Five-Dimensional Visual Analogue Scale [EQ-5D], and Functional Assessment of Cancer Therapy-General [FACT-G]) and clinical information on 508 participants undergoing chemotherapy. Six feature sets (with the following variables: [1] all available; [2] clinical; [3] PROMs; [4] clinical and QLQ-C30; [5] clinical and EQ-5D; [6] clinical and FACT-G) were applied in six ML models (logistic regression [LR], decision tree, adaptive boosting, random forest [RF], support vector machines [SVMs], and neural network) to predict admissions, triage events, and chemotherapy changes.
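The experimental design described above can be sketched as a grid of feature sets, models, and outcomes. This is only an illustration of the combinatorics, not the authors' code: the column names (`clinical`, `qlq_c30`, `eq_5d`, `fact_g`) and the helper functions are hypothetical placeholders, since the abstract names the instruments but not the underlying variables.

```python
from itertools import product

# Hypothetical column groups (assumption): one placeholder column per source.
CLINICAL = ["clinical"]
PROMS = {"QLQ-C30": ["qlq_c30"], "EQ-5D": ["eq_5d"], "FACT-G": ["fact_g"]}

# Model and outcome labels as listed in the abstract.
MODELS = ["LR", "decision tree", "adaptive boosting", "RF", "SVM", "neural network"]
OUTCOMES = ["admission", "triage event", "chemotherapy change"]


def build_feature_sets():
    """Return the six feature sets, keyed by a short descriptive name."""
    all_proms = [col for cols in PROMS.values() for col in cols]
    feature_sets = {
        "all": CLINICAL + all_proms,   # [1] all available
        "clinical": list(CLINICAL),    # [2] clinical only
        "proms": all_proms,            # [3] PROMs only
    }
    # [4]-[6] clinical variables plus one PROM instrument each
    for name, cols in PROMS.items():
        feature_sets[f"clinical+{name}"] = CLINICAL + cols
    return feature_sets


def experiment_grid():
    """Enumerate every (feature set, model, outcome) combination."""
    return list(product(build_feature_sets(), MODELS, OUTCOMES))
```

Under these assumptions the grid holds 6 × 6 × 3 = 108 combinations, each of which would be fitted and scored separately.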
A comprehensive analysis of the predictive performance of the six ML models, for each feature set and under three methods of handling class imbalance, indicated that PROMs improved predictions of all outcomes. RF and SVMs had the highest performance for predicting admissions and changes to chemotherapy in balanced data sets, whereas LR performed best in the imbalanced data set. Balancing the data led to better performance than using the imbalanced data set or a data set in which only the training set was balanced.
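The three data-handling variants compared here (imbalanced data, fully balanced data, and a balanced training set only) can be sketched as follows. The abstract does not say which balancing technique was used, so random oversampling of the minority class is an assumption, as are the function names, the 25% test fraction, and the `(features, label)` row layout with binary labels 0 and 1.

```python
import random


def oversample(rows, rng):
    """Randomly duplicate minority-class rows until both classes are equal in size.

    Assumption: random oversampling stands in for whatever balancing
    method the study used. Each row is a (features, label) tuple.
    """
    pos = [r for r in rows if r[1] == 1]
    neg = [r for r in rows if r[1] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:  # nothing to duplicate
        return list(rows)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(rows) + extra


def split(rows, rng, test_fraction=0.25):
    """Shuffle a copy of the rows and split it into train and test parts."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def prepare(rows, method, seed=0):
    """Return (train, test) under one of the three imbalance-handling variants."""
    rng = random.Random(seed)
    if method == "imbalanced":
        return split(rows, rng)                    # no balancing at all
    if method == "balanced":
        return split(oversample(rows, rng), rng)   # balance first, then split
    if method == "balanced_train_only":
        train, test = split(rows, rng)             # split first
        return oversample(train, rng), test        # balance the train part only
    raise ValueError(f"unknown method: {method}")
```

A fixed seed keeps the splits reproducible, so the same models can be compared across the three variants on identical partitions of the original rows.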
These results support the view that ML can be applied to PROM data to predict hospital utilization and chemotherapy management. If explored further, this approach may contribute to health care planning and treatment personalization. The rigorous comparison of how different methods of handling imbalanced data affect model performance demonstrates best practice in ML research.