School of Nursing, Jilin University, No.965 Xinjiang Street, Changchun, 130021, China.
Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 611731, China.
Int J Med Inform. 2024 Dec;192:105630. doi: 10.1016/j.ijmedinf.2024.105630. Epub 2024 Sep 14.
The incidence of lymphoma is rising, and symptom burden such as cancer-related fatigue (CRF) severely impairs the quality of life of lymphoma survivors. However, clinical diagnosis and treatment of CRF remain inadequate.
This study aimed to construct machine learning-based CRF prediction models for lymphoma survivors, to help healthcare professionals accurately identify patients with CRF and better personalize their treatment and care.
A cross-sectional study in China recruited lymphoma patients from June 2023 to March 2024 and divided them into two datasets, one for model construction and one for external validation. Six machine learning algorithms were used: Logistic Regression (LR), Random Forest (RF), Single Hidden Layer Neural Network (SNN), Support Vector Machine (SVM), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM). Performance metrics such as the area under the receiver operating characteristic curve (AUROC) and calibration curves were compared. Clinical applicability was assessed by decision curve analysis, and Shapley additive explanations (SHAP) were used to explain variable importance.
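The evaluation metrics named above can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the function names (`auroc`, `net_benefit`, `calibration_bins`) are hypothetical, and the labels and predicted probabilities are made-up toy values, not study data. It shows AUROC as a pairwise ranking probability, the net benefit that a decision curve plots at one threshold probability, and the binned points that make up a calibration curve.

```python
from typing import List, Sequence, Tuple


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC: probability that a random positive case is scored above a
    random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Decision-curve net benefit at one threshold probability pt:
    TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def calibration_bins(y_true: Sequence[int], y_score: Sequence[float],
                     n_bins: int = 5) -> List[Tuple[float, float, int]]:
    """Points of a calibration curve: for each non-empty equal-width bin,
    (mean predicted probability, observed event rate, number of cases).
    Scores are assumed to lie in [0, 1]."""
    sums = [[0.0, 0, 0] for _ in range(n_bins)]  # score sum, events, cases
    for y, s in zip(y_true, y_score):
        idx = min(int(s * n_bins), n_bins - 1)
        sums[idx][0] += s
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(ss / n, ev / n, n) for ss, ev, n in sums if n > 0]


# Toy data (hypothetical): 1 = CRF present, 0 = CRF absent.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]

print(auroc(toy_labels, toy_scores))             # 0.9375
print(net_benefit(toy_labels, toy_scores, 0.5))  # 0.25
print(calibration_bins(toy_labels, toy_scores, n_bins=2))
```

A decision curve repeats `net_benefit` over a range of thresholds and compares the model against the "treat all" and "treat none" strategies. In this toy example, treating everyone at a threshold of 0.5 gives a net benefit of 0, so the model's 0.25 would indicate added value at that threshold.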
CRF incidence was 40.7% (dataset I) and 44.8% (dataset II). LightGBM performed strongly in training and internal validation, whereas LR performed best in external validation, with the highest AUROC and the best calibration. Pain, total protein, physical function, and sleep disturbance were important predictors of CRF.
The study presents a machine learning-based CRF prediction model for lymphoma patients, offering dynamic, data-driven assessments that could enhance the development of automated CRF screening tools for personalized management in clinical practice.