Sato Jumpei, Mitsutake Naohiro, Kitsuregawa Masaru, Ishikawa Tomoki, Goda Kazuo
Institute of Industrial Science, The University of Tokyo.
Institute for Health Economics and Policy.
Environ Health Prev Med. 2022;27:42. doi: 10.1265/ehpm.22-00084.
Driven by the rapid aging of its population, Japan introduced public long-term care insurance in 2000 to reinforce healthcare services for the elderly. Precise prediction of future demand for long-term care services helps authorities plan and manage healthcare resources and helps citizens prevent deterioration of their health status.
This paper presents a study to develop an effective model for predicting individual-level future long-term care demand from previous healthcare insurance claims data. We designed two discriminative models, then trained and validated them using three learning algorithms on medical and long-term care insurance claims and enrollment records provided by 170 regional public insurers in Gifu, Japan.
The prediction model based on multiclass classification and a gradient-boosting decision tree achieved practically high accuracy (weighted averages of Precision, 0.872; Recall, 0.878; and F-measure, 0.873) for up to 12 months after the previous claims. The most important feature variables were indicators of current health status (e.g., current eligibility level and age), risk factors for worsening future health status (e.g., dementia), and preventive care services for improving future health status (e.g., training and rehabilitation).
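For readers unfamiliar with the weighted averages reported above, the following is a minimal sketch of how per-class Precision, Recall, and F-measure can be combined with weights proportional to each class's share of the true labels. The function name `weighted_prf`, the class labels, and the toy data are hypothetical and are not taken from the study, which may have computed its metrics differently.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Return support-weighted precision, recall, and F-measure.

    Hypothetical helper for illustration only: each class's score is
    weighted by the fraction of true labels belonging to that class.
    """
    support = Counter(y_true)        # number of true instances per class
    predicted = Counter(y_pred)      # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = recall = f_measure = 0.0
    for label in sorted(support):
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / support[label]
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support[label] / total
        precision += weight * p
        recall += weight * r
        f_measure += weight * f
    return precision, recall, f_measure


# Toy example with made-up eligibility labels (not real data).
y_true = ["care1", "care1", "care2", "care2", "care2", "none"]
y_pred = ["care1", "care2", "care2", "care2", "none", "none"]
precision, recall, f_measure = weighted_prf(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f_measure, 3))
```

With this weighting, the weighted Recall equals the overall fraction of correctly classified individuals, which is one reason it is a common summary for multiclass problems with imbalanced classes.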
Intensive validation tests indicated that the developed prediction method is highly robust, although it yields relatively lower accuracy for specific patient groups whose health conditions are hard to distinguish.