Construction and Interpretation of Prediction Model of Teicoplanin Trough Concentration via Machine Learning.

Author information

Ma Pan, Liu Ruixiang, Gu Wenrui, Dai Qing, Gan Yu, Cen Jing, Shang Shenglan, Liu Fang, Chen Yongchuan

Affiliations

Department of Pharmacy, The First Affiliated Hospital of Third Military Medical University (Army Medical University), Chongqing, China.

Department of Clinical Pharmacy, General Hospital of Central Theater Command of PLA, Wuhan, China.

Publication information

Front Med (Lausanne). 2022 Mar 8;9:808969. doi: 10.3389/fmed.2022.808969. eCollection 2022.

Abstract

OBJECTIVE

To establish an optimal machine learning model for predicting teicoplanin trough concentrations, and to explain feature importance in that model using the SHapley Additive exPlanations (SHAP) method.

METHODS

A retrospective study was performed on 279 therapeutic drug monitoring (TDM) measurements obtained from 192 patients who received intravenous teicoplanin at the First Affiliated Hospital of Army Medical University from November 2017 to July 2021. The study included 27 variables, with the teicoplanin trough concentration as the target variable. The whole dataset was divided into a training group and a testing group at a ratio of 8:2, and predictive performance was compared among six different algorithms. The three best-performing algorithms were combined into an ensemble prediction model, and SHAP was used to interpret the model.
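The split-and-compare step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, only the three algorithms named in the Results are included (the other three candidates are not listed in the abstract), the hyperparameters are arbitrary, and ranking by R² on the testing group is an assumption about what "score" means. The abstract also does not say whether the split was made by measurement or by patient; this sketch splits by measurement.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in with the study's shape: 279 measurements, 27 variables.
# These numbers are invented for illustration and are NOT the study data.
rng = np.random.default_rng(0)
X = rng.normal(size=(279, 27))
y = 15 + 4 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=2.0, size=279)

# 8:2 split into a training group and a testing group.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Candidate regressors; hyperparameters here are arbitrary assumptions.
candidates = {
    "SVR": make_pipeline(StandardScaler(), SVR(C=10.0)),
    "GBRT": GradientBoostingRegressor(random_state=0),
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
}

# Fit each candidate on the training group and score it on the testing group.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # R^2 on held-out data

# Rank candidates from best to worst held-out R^2.
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: R2 = {scores[name]:.3f}")
```

With 279 measurements, an 8:2 split leaves 56 in the testing group, which is consistent with the reported accuracies (47/56 = 83.93% and 34/56 = 60.71%).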

RESULTS

The three highest-scoring algorithms, support vector regression (SVR), gradient boosted regression trees (GBRT), and random forest (RF), with scores of 0.676, 0.670, and 0.656, respectively, were combined into the ensemble model at a ratio of 6:3:1. The resulting model achieved R² = 0.720, MAE = 3.628, MSE = 22.571, an absolute accuracy of 83.93%, and a relative accuracy of 60.71%. It fit the data better and predicted more accurately than any single algorithm. SHAP values were used to visualize the importance and direction of effect of each variable; teicoplanin administration and renal function were the most important factors.
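A 6:3:1 ensemble and the reported metrics can be illustrated with a small worked example. This is a sketch under stated assumptions: the ensemble is taken to be a weighted average of the three models' predictions, and the tolerances used for absolute accuracy (within ±5 of the observed value) and relative accuracy (within ±30% of the observed value) are hypothetical, since the abstract does not define them. The function names and the toy numbers are invented for illustration.

```python
def weighted_ensemble(predictions, weights):
    """Blend per-model prediction lists using weights normalized to sum to 1."""
    total = sum(weights.values())
    n = len(next(iter(predictions.values())))
    return [
        sum(weights[m] * predictions[m][i] for m in weights) / total
        for i in range(n)
    ]


def evaluate(y_true, y_pred, abs_tol=5.0, rel_tol=0.30):
    """Return R^2, MAE, MSE, and two tolerance-based accuracies.

    abs_tol and rel_tol are assumed thresholds, not values from the abstract.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot
    # Share of predictions within a fixed distance of the observed value.
    abs_acc = sum(abs(e) <= abs_tol for e in errors) / n
    # Share of predictions within a fixed fraction of the observed value.
    rel_acc = sum(abs(e) <= rel_tol * abs(t) for e, t in zip(errors, y_true)) / n
    return {"r2": r2, "mae": mae, "mse": mse,
            "abs_accuracy": abs_acc, "rel_accuracy": rel_acc}


# Toy observed trough concentrations and per-model predictions (invented).
y_true = [10.0, 20.0, 30.0, 40.0]
predictions = {
    "SVR": [12.0, 18.0, 33.0, 38.0],
    "GBRT": [9.0, 22.0, 28.0, 45.0],
    "RF": [15.0, 25.0, 25.0, 30.0],
}
weights = {"SVR": 6, "GBRT": 3, "RF": 1}  # the 6:3:1 ratio

blend = weighted_ensemble(predictions, weights)
metrics = evaluate(y_true, blend)
print(blend)
print(metrics)
```

Normalizing the weights inside `weighted_ensemble` lets the ratio be passed as written (6, 3, 1) rather than as fractions.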

CONCLUSION

We applied machine learning to predict teicoplanin trough concentrations for the first time and interpreted the prediction model with the SHAP method, which is of considerable value for guiding clinical medication.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9d3d/8963816/ae5510e7165a/fmed-09-808969-g0001.jpg
