Department of Pharmacology and Pharmacogenomics Research Center, Inje University College of Medicine, Busan, Korea.
Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan, Korea.
J Thromb Haemost. 2021 Jul;19(7):1676-1686. doi: 10.1111/jth.15318. Epub 2021 Apr 21.
Personalized warfarin dosing depends on both genetic and non-genetic factors. Multiple linear regression (LR) is the conventional method for developing predictive dosing models. Recently, machine learning approaches have been widely applied to warfarin dosing, on the hypothesis that the association between covariates and stable warfarin dose is non-linear.
To extend the multiple linear regression algorithm for personalized warfarin dosing in a Korean population and compare it with a machine learning-based algorithm.
In this cohort study, we collected information on 650 patients taking warfarin who had achieved steady state, including demographic information, indications, comorbidities, comedications, habits, and genetic factors. The dataset was randomly split into a training set (90%) and a test set (10%). The LR and machine learning (gradient boosting machine [GBM]) models were developed on the training set and evaluated on the test set.
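The random 90%/10% split described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `split_train_test`, the seed, and the placeholder patient IDs are assumptions made for the example and are not taken from the study.

```python
import random


def split_train_test(records, test_fraction=0.10, seed=0):
    """Shuffle a copy of `records` and return (train, test).

    `test_fraction` and `seed` are illustrative defaults, not study settings.
    """
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder IDs standing in for the 650 patients (not real study data).
patient_ids = list(range(650))
train_ids, test_ids = split_train_test(patient_ids)
print(len(train_ids), len(test_ids))
```

With 650 records, a 10% hold-out of this kind leaves 585 patients for model development and 65 for evaluation.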
The LR and GBM models were comparable in terms of accuracy of ideal dose (75.38% and 73.85%), correlation (0.77 and 0.73), mean absolute error (0.58 mg/day and 0.64 mg/day), and root mean square error (0.82 mg/day and 0.9 mg/day), respectively. VKORC1 genotype, CYP2C9 genotype, age, and weight were the highest contributors and together attained 80% of the maximum performance in both models.
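The four evaluation metrics named above can be computed as in the sketch below. The abstract does not define "accuracy of ideal dose"; the example assumes the definition commonly used in warfarin dosing studies, namely the share of patients whose predicted dose falls within ±20% of the actual stable dose. That tolerance, the function names, and the toy doses are all assumptions for illustration, not values from the study.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted doses."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared prediction error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def ideal_dose_rate(actual, predicted, tolerance=0.20):
    """Fraction of predictions within +/- tolerance of the actual dose.

    The 20% tolerance is an assumed definition, not stated in the abstract.
    """
    hits = sum(1 for a, p in zip(actual, predicted) if abs(p - a) <= tolerance * a)
    return hits / len(actual)


# Toy doses in mg/day, invented for illustration only.
actual_doses = [2.0, 4.0, 6.0]
predicted_doses = [2.0, 5.0, 6.0]
print(mean_absolute_error(actual_doses, predicted_doses))
print(root_mean_square_error(actual_doses, predicted_doses))
print(pearson_r(actual_doses, predicted_doses))
print(ideal_dose_rate(actual_doses, predicted_doses))
```

On a 65-patient test set (10% of 650), the reported accuracies are consistent with 49 of 65 (75.38%) and 48 of 65 (73.85%) patients, so the two models differ by a single patient on this metric.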
This study shows that our LR and GBM models predict warfarin dose satisfactorily in our dataset. Both models showed similar performance and feature contribution characteristics. LR may be the appropriate model because of its simplicity and interpretability.