Institute for Healthcare Delivery Science, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
Tisch Cancer Institute, Mount Sinai Hospital, New York, NY, 10029, USA.
BMC Health Serv Res. 2020 Apr 25;20(1):350. doi: 10.1186/s12913-020-05148-y.
The Oncology Care Model (OCM) was developed as a payment model to encourage participating practices to provide better-quality care for cancer patients at a lower cost. The risk-adjustment model used in OCM is a Gamma generalized linear model (Gamma GLM) with a log link. For the episodes identified at our academic medical center (AMC), the expenses predicted by the model fitted to national data did not correlate well with our observed expenses. This motivated us to fit the Gamma GLM to our AMC data and compare it with two more flexible modeling methods: Random Forest (RF) and Partially Linear Additive Quantile Regression (PLAQR). We also performed a simulation study to compare the performance of these methods and to examine the impact of non-linearity and interaction effects, two understudied aspects of cost prediction.
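For readers less familiar with the model class, a Gamma GLM with a log link can be written as below. This is the standard textbook form, not the specific OCM specification; the symbols (cost $Y_i$, covariate vector $x_i$, coefficients $\beta$, dispersion $\phi$) are notation assumed here and do not come from the source.

```latex
\[
  Y_i \mid x_i \sim \mathrm{Gamma}, \qquad
  \log \mu_i = x_i^{\top}\beta, \qquad
  \mu_i = \mathbb{E}[Y_i \mid x_i] = \exp\!\left(x_i^{\top}\beta\right), \qquad
  \operatorname{Var}(Y_i \mid x_i) = \phi\,\mu_i^{2}.
\]
```

Under this form each covariate acts multiplicatively on the mean cost, and the variance grows with the square of the mean.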
In the simulation, the cost outcome was generated from four distributions: Gamma, Weibull, Log-normal with a heteroscedastic error term, and heavy-tailed. Simulation parameters both similar to and different from those of the OCM data were considered. The performance metrics were the root mean square error (RMSE), mean absolute prediction error (MAPE), and cost accuracy (CA). Bootstrap resampling was used to estimate the operating characteristics of the performance metrics, which were summarized with boxplots.
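The two error metrics and the bootstrap step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the simulated Gamma costs, and the choice to resample fixed predictions (rather than refit each model per replicate) are all hypothetical. CA is omitted because its formula is not given in this abstract.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted costs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mape(y_true, y_pred):
    """Mean absolute prediction error, in cost units (not a percentage)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def bootstrap_metric(y_true, y_pred, metric, n_boot=1000, seed=0):
    """Bootstrap distribution of a metric.

    Assumption: (observed, predicted) pairs are resampled with replacement
    and the metric is recomputed; the models are not refitted.
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = len(y_true)
    values = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample row indices with replacement
        values[b] = metric(y_true[idx], y_pred[idx])
    return values


# Hypothetical example: Gamma-distributed costs with a log-linear mean.
rng = np.random.default_rng(42)
x = rng.normal(size=500)
mu = np.exp(9.0 + 0.5 * x)                 # assumed true mean cost
shape = 2.0                                # assumed Gamma shape parameter
cost = rng.gamma(shape, mu / shape)        # mean = shape * scale = mu
pred = mu                                  # stand-in for a model's predictions

rmse_boot = bootstrap_metric(cost, pred, rmse, n_boot=200)
mape_boot = bootstrap_metric(cost, pred, mape, n_boot=200)
```

The arrays `rmse_boot` and `mape_boot` hold one metric value per bootstrap replicate and could be passed directly to a boxplot routine to compare methods.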
RF performed best in most scenarios, with the lowest RMSE and MAPE and the highest CA. When the models were misspecified, the differences in their performance became more pronounced. Model performance differed more for non-exponential than for exponential outcome distributions.
RF outperformed Gamma GLM and PLAQR in predicting overall and top-decile costs. RF demonstrated improved prediction under various scenarios common in healthcare cost modeling. Additionally, RF did not require prespecification of the outcome distribution, nonlinear effects, or interaction terms. Therefore, RF appears to be the best tool to predict average cost. However, when the goal is to estimate extreme expenses, e.g., high-cost episodes, the accuracy gained with RF may need to be weighed against its computational cost.