Predicting high health-cost users among people with cardiovascular disease using machine learning and nationwide linked social administrative datasets.

Author information

Nghiem Nhung, Atkinson June, Nguyen Binh P, Tran-Duy An, Wilson Nick

Affiliations

Department of Public Health, University of Otago, Wellington, New Zealand.

School of Mathematics and Statistics, Victoria University of Wellington, Wellington, New Zealand.

Publication information

Health Econ Rev. 2023 Feb 4;13(1):9. doi: 10.1186/s13561-023-00422-1.

Abstract

OBJECTIVES

To optimise planning of public health services, the impact of high-cost users needs to be considered. However, most existing statistical models of costs omit many of the clinical and social variables in administrative data that are associated with elevated health care resource use and that are increasingly available. This study aimed to use machine learning approaches and big data to predict high-cost users among people with cardiovascular disease (CVD).

METHODS

We used nationally representative linked datasets in New Zealand to predict which prevalent CVD cases would be high-cost users, defined as those in the top quintile by cost. We compared the performance of four popular machine learning models (L1-regularised logistic regression, classification trees, k-nearest neighbours (KNN) and random forest) with that of traditional regression models.
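As a rough illustration of how a "top quintile by cost" outcome might be constructed before any model is fitted, the sketch below flags records whose cost exceeds the 80th percentile. The cost values, the function name `flag_high_cost`, and the choice of the inclusive interpolation method are all assumptions made for this example; the abstract does not describe how the study computed its threshold.

```python
from statistics import quantiles


def flag_high_cost(costs, n_groups=5):
    """Label each cost 1 if it lies above the top cut point, else 0.

    With n_groups=5 the top cut point is the 80th percentile, so the
    positive class is (roughly) the top quintile by cost.
    Hypothetical helper for illustration only.
    """
    # quantiles() returns n_groups - 1 cut points; the last one
    # separates the top group from the rest.
    cut_points = quantiles(costs, n=n_groups, method="inclusive")
    threshold = cut_points[-1]
    return [1 if cost > threshold else 0 for cost in costs]


# Invented annual health costs for ten people (not study data).
costs = [120, 450, 90, 3000, 760, 15000, 220, 640, 980, 5100]
labels = flag_high_cost(costs)
print(labels)
```

The resulting 0/1 labels would then serve as the binary target for whichever classifiers are being compared.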

RESULTS

The machine learning models predicted high health-cost users far more accurately than the logistic models. The F1 score (the harmonic mean of sensitivity and positive predictive value) of the machine learning models ranged from 30.6% to 41.2%, compared with 8.6% to 9.1% for the logistic models. Previous health costs, income, age, chronic health conditions, deprivation, and receiving a social security benefit were among the most important predictors of high-cost users with CVD.
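For readers less familiar with the metric, the F1 score is the harmonic mean of sensitivity (recall) and positive predictive value (precision). The sketch below computes it from confusion-matrix counts; the counts are invented for illustration and are not results from the study, and the function names are hypothetical.

```python
def rates_from_counts(tp, fp, fn):
    """Return (sensitivity, ppv) from true-positive, false-positive
    and false-negative counts, guarding against empty denominators."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, ppv


def f1_from_rates(sensitivity, ppv):
    """Harmonic mean of sensitivity and positive predictive value."""
    if sensitivity + ppv == 0:
        return 0.0
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# Invented counts: 30 high-cost users correctly flagged,
# 20 flagged in error, 70 missed.
sensitivity, ppv = rates_from_counts(tp=30, fp=20, fn=70)
f1 = f1_from_rates(sensitivity, ppv)
print(f"sensitivity={sensitivity:.2f}, ppv={ppv:.2f}, F1={f1:.2f}")
```

Because the harmonic mean is pulled toward the smaller of the two rates, a model cannot reach a high F1 by doing well on only one of them.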

CONCLUSIONS

This study provides additional evidence that machine learning, combined with big data, can be used in health economics to identify new risk factors and to predict high-cost users with CVD. As such, machine learning may assist with health services planning and preventive measures to improve population health while potentially saving healthcare costs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c584/9898915/a1108230ccfb/13561_2023_422_Fig1_HTML.jpg
