Cardiovascular Outcomes Research Laboratory, University of California, Los Angeles, CA; Computer Science Department, Stanford University, Palo Alto, CA.
Cardiovascular Outcomes Research Laboratory, University of California, Los Angeles, CA.
Surgery. 2024 Aug;176(2):282-288. doi: 10.1016/j.surg.2024.03.051. Epub 2024 May 16.
With health care expenditures rising steadily, factors that may influence the cost of care have drawn considerable attention. Although machine learning models have previously been applied in health economics, their use in cardiac surgery remains limited. We evaluated several machine learning algorithms for modeling hospitalization costs for coronary artery bypass grafting.
All adult hospitalizations for isolated coronary artery bypass grafting were identified in the 2016 to 2020 Nationwide Readmissions Database. Machine learning models were trained to predict expenditures and compared with traditional linear regression. Given the significance of postoperative length of stay, we additionally developed models excluding postoperative length of stay to uncover other drivers of costs. To facilitate comparison, machine learning classification models were also trained to predict patients in the highest decile of costs. Significant factors associated with high cost were identified using SHapley Additive exPlanations beeswarm plots.
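The highest-decile classification target described above could be derived roughly as follows. This is a minimal sketch, not the authors' code: the function name `top_decile_labels`, the choice of the inclusive quantile method, and the synthetic costs are all assumptions made for illustration.

```python
from statistics import quantiles


def top_decile_labels(costs):
    """Return (threshold, labels); labels[i] is 1 when costs[i] exceeds the 90th percentile."""
    # quantiles(..., n=10) returns the 9 cut points between deciles;
    # the last cut point is the 90th percentile.
    # The "inclusive" method is an assumption; the source does not say how the decile was computed.
    threshold = quantiles(costs, n=10, method="inclusive")[-1]
    labels = [1 if cost > threshold else 0 for cost in costs]
    return threshold, labels


# Synthetic costs in dollars (illustrative values only, not study data).
example_costs = [30000 + 1000 * i for i in range(20)]
threshold, labels = top_decile_labels(example_costs)
```

A binary label of this kind is what a classification model would be trained to predict, while the regression models predict the cost itself.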
Among 444,740 hospitalizations included for analysis, the median cost of hospitalization in coronary artery bypass grafting patients was $43,103. eXtreme Gradient Boosting most accurately predicted hospitalization costs, with R = 0.519 over the validation set. The top predictive features in the eXtreme Gradient Boosting model included elective procedure status, prolonged mechanical ventilation, new-onset respiratory failure or myocardial infarction, and postoperative length of stay. After removing postoperative length of stay, eXtreme Gradient Boosting remained the most accurate model (R = 0.38). Prolonged ventilation, respiratory failure, and elective status remained important predictive parameters.
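The abstract reports model accuracy as "R" without defining the statistic. As a hedged illustration only, the sketch below computes two common candidates on a held-out set: the coefficient of determination and the Pearson correlation between observed and predicted costs. The function names and sample values are hypothetical, and which statistic the authors used is not confirmed by the text.

```python
from math import sqrt


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def pearson_r(actual, predicted):
    """Pearson correlation between observed and predicted values."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)


# Hypothetical observed and predicted costs (not study data).
observed = [38000.0, 41000.0, 43000.0, 52000.0, 75000.0]
predicted = [40000.0, 42000.0, 45000.0, 50000.0, 64000.0]
fit = r_squared(observed, predicted)
corr = pearson_r(observed, predicted)
```

The two statistics differ in general: a model can correlate well with observed costs yet have a lower coefficient of determination if its predictions are biased or poorly scaled.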
Machine learning models appear to accurately model total hospitalization costs for coronary artery bypass grafting. Future work is warranted to uncover other drivers of costs and improve the value of care in cardiac surgery.