Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA.
Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA.
Pharmacoepidemiol Drug Saf. 2022 Apr;31(4):393-403. doi: 10.1002/pds.5391. Epub 2021 Dec 28.
Fluoroquinolones are associated with central (CNS) and peripheral (PNS) nervous system symptoms, and predicting the risk of these outcomes may have important clinical implications. Both LASSO and random forest are appealing modeling methods, yet it is not clear which method performs better for clinical risk prediction.
The objective was to compare models developed using LASSO versus random forest for predicting neurological dysfunction among fluoroquinolone users.
We developed and validated risk prediction models using claims data from a commercially insured population. The study cohort included adults dispensed an oral fluoroquinolone, and outcomes were CNS and PNS dysfunction. Model predictors included demographic variables, comorbidities and medications known to be associated with neurological symptoms, and several healthcare utilization predictors. We assessed the accuracy and calibration of these models using measures including AUC, calibration curves, and Brier scores.
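The evaluation measures named above can be illustrated with a minimal sketch. It is not the authors' code: the function names (`auc`, `brier_score`, `calibration_bins`), the equal-width binning used for the calibration summary, and the toy labels and predicted risks are all assumptions made for illustration. AUC is computed through the rank-sum (Mann-Whitney) identity, the Brier score as the mean squared difference between predicted risk and observed outcome, and the calibration summary as mean predicted risk versus observed event rate per bin.

```python
from typing import List, Sequence, Tuple


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    ranks = [0.0] * len(y_prob)
    i = 0
    while i < len(order):
        j = i
        # Tied predictions share the average of the ranks they span.
        while j + 1 < len(order) and y_prob[order[j + 1]] == y_prob[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one event and one non-event")
    rank_sum_pos = sum(r for r, y in zip(ranks, y_true) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_bins(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted risk, observed event rate, count) per non-empty
    equal-width bin; the binning scheme is an assumption of this sketch."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    summary = []
    for b in bins:
        if b:
            mean_pred = sum(p for _, p in b) / len(b)
            observed = sum(y for y, _ in b) / len(b)
            summary.append((mean_pred, observed, len(b)))
    return summary


# Toy data (hypothetical, not from the study): 1 = event, 0 = no event.
toy_labels = [0, 0, 1, 1]
toy_risks = [0.1, 0.4, 0.35, 0.8]

if __name__ == "__main__":
    print("AUC:", auc(toy_labels, toy_risks))
    print("Brier score:", brier_score(toy_labels, toy_risks))
    for mean_pred, observed, count in calibration_bins(toy_labels, toy_risks, n_bins=2):
        print(f"predicted={mean_pred:.3f} observed={observed:.3f} n={count}")
```

A higher AUC indicates better discrimination, while a lower Brier score indicates predicted risks closer to the observed outcomes. Plotting the binned pairs of predicted and observed risk gives a simple calibration curve.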
The underlying cohort contained 16 533 (1.18%) individuals with CNS dysfunction and 46 995 (3.34%) individuals with PNS dysfunction during 120 days of follow-up. For CNS dysfunction, LASSO had an AUC of 0.81 (95% CI: 0.80, 0.82), while random forest had an AUC of 0.80 (95% CI: 0.80, 0.81). For PNS dysfunction, LASSO had an AUC of 0.75 (95% CI: 0.74, 0.76) versus an AUC of 0.73 (95% CI: 0.73, 0.74) for random forest. Both LASSO models were also better calibrated than their random forest counterparts, with lower Brier scores: 0.17 (LASSO) versus 0.20 (random forest) for CNS dysfunction and 0.20 (LASSO) versus 0.25 (random forest) for PNS dysfunction.
LASSO outperformed random forest in predicting CNS and PNS dysfunction among fluoroquinolone users, and should be considered for modeling when the cohort is modest in size, when the number of model predictors is modest, and when predictors are primarily binary.