Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City 242062, Taiwan.
Department of Neurology, Fu Jen Catholic University Hospital, Fu Jen Catholic University, New Taipei City 24352, Taiwan.
Int J Environ Res Public Health. 2023 Jan 29;20(3):2359. doi: 10.3390/ijerph20032359.
The new generation of non-vitamin K antagonist oral anticoagulants is widely used for stroke prevention because of its notable efficacy and safety. Our study aimed to develop suggested-use rules for dabigatran through an integrated machine learning (ML) decision-tree model. Participants taking different doses of dabigatran in the Randomized Evaluation of Long-Term Anticoagulant Therapy (RE-LY) trial were included in our analysis and divided into 110 mg and 150 mg groups. The proposed scheme integrated four ML methods, namely naive Bayes, random forest (RF), classification and regression tree (CART), and extreme gradient boosting (XGBoost), to identify the essential variables for predicting vascular events in the 110 mg group and bleeding in the 150 mg group. RF (0.764 for 110 mg; 0.747 for 150 mg) and XGBoost (0.708 for 110 mg; 0.761 for 150 mg) achieved higher area under the receiver operating characteristic curve (AUC) values than logistic regression, the benchmark model (0.683 for 110 mg; 0.739 for 150 mg). We then selected the top ten important variables as internal nodes of the CART decision tree. The two best CART models, each built on ten important variables, produced tree-shaped rules for predicting vascular events in the 110 mg group and bleeding in the 150 mg group. Our model can provide more visual and interpretable suggested rules to clinicians managing patients with nonvalvular atrial fibrillation (NVAF) who are taking dabigatran.
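The workflow summarized above (compare ensemble learners with a logistic regression benchmark by AUC, rank the variables, then fit a CART tree on the top ten) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the variable names (`var_0`, ...) are hypothetical, the 70/30 split and tree depth are arbitrary choices, and only the random forest is used for ranking, with scikit-learn standing in for the full naive Bayes / RF / CART / XGBoost scheme.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for one dose group (assumption: 30 candidate
# variables and an imbalanced binary outcome; not trial data).
X, y = make_classification(n_samples=2000, n_features=30, n_informative=8,
                           weights=[0.85, 0.15], random_state=0)
feature_names = [f"var_{i}" for i in range(X.shape[1])]  # hypothetical names
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)


def fit_and_score(model, columns):
    """Fit on the training split; return the fitted model and its test AUC."""
    model.fit(X_train[:, columns], y_train)
    scores = model.predict_proba(X_test[:, columns])[:, 1]
    return model, roc_auc_score(y_test, scores)


all_columns = np.arange(X.shape[1])

# Benchmark model and one ensemble learner, both on all variables.
_, auc_logit = fit_and_score(LogisticRegression(max_iter=1000), all_columns)
forest, auc_forest = fit_and_score(
    RandomForestClassifier(n_estimators=300, random_state=0), all_columns)

# Rank variables by impurity-based importance and keep the top ten.
top_ten = np.argsort(forest.feature_importances_)[::-1][:10]

# Shallow CART tree restricted to those ten variables, so the rules stay readable.
cart, auc_cart = fit_and_score(
    DecisionTreeClassifier(max_depth=3, random_state=0), top_ten)
rules = export_text(cart, feature_names=[feature_names[i] for i in top_ten])

print(f"AUC logistic={auc_logit:.3f} forest={auc_forest:.3f} cart={auc_cart:.3f}")
print(rules)
```

Restricting the tree to a shallow depth trades some discrimination for rules short enough to read at a glance, which is the kind of interpretability the abstract emphasizes.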