Department of Pharmacy, Tianjin Medical University General Hospital, Tianjin, China.
School of Mathematics, Tianjin University, Tianjin, China.
Cancer Med. 2023 Sep;12(17):18306-18316. doi: 10.1002/cam4.6428. Epub 2023 Aug 23.
This study aimed to develop a risk prediction model for chemotherapy-induced nausea and vomiting (CINV) in cancer patients receiving highly emetogenic chemotherapy (HEC) and to identify the variables with the greatest impact on prediction.
Data from Tianjin Medical University General Hospital were collected and subjected to stepwise data preprocessing. Deep learning algorithms, including deep forest, and typical machine learning algorithms such as support vector machine (SVM), categorical boosting (CatBoost), random forest, decision tree, and neural network were used to develop the prediction model. The models were trained and their hyperparameters optimized (HPO) through cross-validation on the training set; performance was then evaluated on the test set using AUC, F1 score, accuracy, specificity, sensitivity, and Brier score. Shapley additive explanations (SHAP), partial dependence plots (PDP), and Local Interpretable Model-Agnostic Explanations (LIME) were employed to explain the optimal model.
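The six evaluation metrics named above can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the function name `evaluate_binary`, the 0.5 decision threshold, and the toy labels and probabilities in the usage note are assumptions made only for illustration.

```python
from typing import Dict, Sequence


def evaluate_binary(y_true: Sequence[int],
                    y_prob: Sequence[float],
                    threshold: float = 0.5) -> Dict[str, float]:
    """Compute AUC, F1, accuracy, specificity, sensitivity and Brier score.

    y_true holds 0/1 labels (1 = event, e.g. CINV occurred); y_prob holds the
    predicted probability of the event. The threshold is an assumed default.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and equal in length")

    # Hard predictions are only needed for the confusion-matrix metrics.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")

    # AUC as the probability that a random positive is ranked above a random
    # negative, with ties counted as one half (Mann-Whitney formulation).
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))

    # Brier score: mean squared error between probability and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    f1_denominator = 2 * tp + fp + fn
    return {
        "auc": auc,
        "f1": 2 * tp / f1_denominator if f1_denominator else 0.0,
        "accuracy": (tp + tn) / len(y_true),
        "specificity": tn / (tn + fp),
        "sensitivity": tp / (tp + fn),
        "brier": brier,
    }
```

For example, `evaluate_binary([1, 1, 0, 0, 0], [0.9, 0.4, 0.6, 0.2, 0.1])` scores five made-up patients. Note that AUC and the Brier score are computed from the predicted probabilities, whereas F1, accuracy, specificity, and sensitivity depend on the chosen threshold.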
The deep forest model exhibited good discrimination, outperforming typical machine learning models, with an AUC of 0.850 (95% CI, 0.780-0.919), an F1 score of 0.757, an accuracy of 0.852, a specificity of 0.863, a sensitivity of 0.784, and a Brier score of 0.082. The five most important features in the model were creatinine clearance (Ccr), age, gender, anticipatory nausea and vomiting, and antiemetic regimen. Among these, Ccr had the greatest predictive value. The risk of CINV decreased with increasing Ccr and age, and was higher in the presence of anticipatory nausea and vomiting, female gender, and a non-standard antiemetic regimen.
The deep forest model demonstrated good discrimination in predicting the risk of CINV in cancer patients prescribed HEC. Kidney function, as represented by Ccr, played a crucial role in the model's prediction. The clinical application of this predictive tool can help assess individual risks and improve patient care by proactively optimizing the use of antiemetics in cancer patients receiving HEC.