Liu Rong, Li Xi, Zhang Wei, Zhou Hong-Hao
Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha, P. R. China; Institute of Clinical Pharmacology, Central South University, Hunan Key Laboratory of Pharmacogenetics, Changsha, P. R. China.
PLoS One. 2015 Aug 25;10(8):e0135784. doi: 10.1371/journal.pone.0135784. eCollection 2015.
Multiple linear regression (MLR) and machine learning techniques have both been reported in pharmacogenetic algorithm-based warfarin dosing. However, the performance of these algorithms in racially diverse groups has never been objectively evaluated and compared. In this literature-based study, we compared the performance of eight machine learning techniques with that of MLR in a large, racially diverse cohort.
MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied to warfarin dose algorithms in a cohort from the International Warfarin Pharmacogenetics Consortium database. Covariates selected by stepwise regression in a randomly chosen 80% of patients were used to develop the algorithms. To compare the algorithms, the mean percentage of patients whose predicted dose fell within 20% of the actual dose (mean percentage within 20%) and the mean absolute error (MAE) were calculated in the remaining 20% of patients. Performance was also compared across racial groups and across therapeutic warfarin dose ranges. Robust results were obtained from 100 rounds of resampling.
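The two evaluation metrics and the repeated 80/20 split described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the `mean_dose_baseline` placeholder model, the fixed random seed, and the choice to count a prediction exactly 20% off as "within 20%" are all assumptions made for illustration only.

```python
import random
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Mean absolute error, in the same units as the doses (e.g. mg/week)."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def pct_within_20(actual, predicted):
    """Percentage of patients whose predicted dose lies within 20% of the actual dose.

    Assumption: the boundary (exactly 20% off) is counted as a hit.
    """
    hits = sum(abs(p - a) <= 0.2 * a for a, p in zip(actual, predicted))
    return 100.0 * hits / len(actual)


def mean_dose_baseline(train):
    """Hypothetical placeholder model: always predicts the mean training dose."""
    avg = mean(dose for _, dose in train)
    return lambda features: avg


def resample_evaluate(patients, fit, rounds=100, train_frac=0.8, seed=0):
    """Repeat a random train/test split and average both metrics over all rounds.

    `patients` is a list of (features, dose) pairs; `fit` takes the training
    pairs and returns a function mapping features to a predicted dose.
    """
    rng = random.Random(seed)
    maes, pcts = [], []
    for _ in range(rounds):
        shuffled = patients[:]
        rng.shuffle(shuffled)
        cut = int(train_frac * len(shuffled))
        train, test = shuffled[:cut], shuffled[cut:]
        predict = fit(train)
        actual = [dose for _, dose in test]
        predicted = [predict(features) for features, _ in test]
        maes.append(mean_absolute_error(actual, predicted))
        pcts.append(pct_within_20(actual, predicted))
    return mean(maes), mean(pcts)
```

Any of the nine techniques could in principle be supplied as `fit`, provided it returns a prediction function; the baseline here only stands in for a real model so that the sketch runs without external libraries.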
BART, MARS and SVR were statistically indistinguishable from one another and significantly outperformed all the other approaches in the whole cohort (MAE: 8.84-8.96 mg/week; mean percentage within 20%: 45.88%-46.35%). In the White population, MARS and BART showed a higher mean percentage within 20% and a lower mean MAE than MLR (all p values < 0.05). In the Asian population, SVR, BART, MARS and LAR performed the same as MLR. MLR and LAR performed best in the Black population. When patients were grouped by warfarin dose range, all machine learning techniques except ANN and LAR showed a significantly higher mean percentage within 20% and a lower MAE (all p values < 0.05) than MLR in the low- and high-dose ranges.
Overall, the machine learning-based techniques BART, MARS and SVR outperformed MLR in warfarin pharmacogenetic dosing. Algorithm performance differed among racial groups. Moreover, machine learning-based algorithms tended to perform better than MLR in the low- and high-dose ranges.