Stollenwerk Björn, Welchowski Thomas, Vogl Matthias, Stock Stephanie
Institute of Health Economics and Health Care Management, Helmholtz Zentrum München (GmbH), Ingolstädter Landstraße 1, 85764, Neuherberg, Germany.
Institut für Medizinische Biometrie, Informatik und Epidemiologie (IMBIE), Universitätsklinikum Bonn, Sigmund-Freud-Straße 25, 53105, Bonn, Germany.
Eur J Health Econ. 2016 Apr;17(3):235-44. doi: 10.1007/s10198-015-0667-z. Epub 2015 Feb 4.
Despite the increasing availability of routine data, no analysis method has yet been presented for cost-of-illness (COI) studies based on massive data. We aim, first, to present such a method and, second, to assess whether the associated gain in numerical efficiency is of practical relevance. We propose a prevalence-based, top-down regression approach consisting of five steps: aggregating the data; fitting a generalized additive model (GAM); predicting costs via the fitted GAM; comparing predicted costs between prevalent and non-prevalent subjects; and quantifying the stochastic uncertainty via error propagation. To demonstrate the method, we applied it, in the context of chronic lung disease, to aggregated German sickness fund data from 1999 covering over 7.3 million insured persons. To assess the gain in numerical efficiency, we compared the computational time of the proposed approach with that of corresponding GAMs fitted to simulated individual-level data. Furthermore, the probability of model failure was modeled via logistic regression. Applying the proposed method was reasonably fast (19 min). In contrast, with patient-level data, computational time increased disproportionately with sample size. Furthermore, using patient-level data was accompanied by a substantial risk of model failure (about 80% for 6 million subjects). The gain in computational efficiency of the proposed COI method seems to be of practical relevance. Furthermore, the method may yield more precise cost estimates.
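The five steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: to stay self-contained it replaces the GAM with simple per-cell means (a saturated, stratified model), and the record layout, the function names (`aggregate`, `cell_mean_and_var`, `attributable_cost`), and the toy data are all hypothetical. The sketch aggregates individual records into cells holding the count, sum, and sum of squares of costs, compares mean costs of prevalent and non-prevalent subjects within each stratum, and propagates the cell-level variances to the total, assuming independent cells and fixed cell counts.

```python
import math
from collections import defaultdict


def aggregate(records):
    """Step 1: collapse (age_group, sex, prevalent, cost) rows into
    per-cell [n, sum of costs, sum of squared costs]."""
    cells = defaultdict(lambda: [0, 0.0, 0.0])
    for age_group, sex, prevalent, cost in records:
        cell = cells[(age_group, sex, prevalent)]
        cell[0] += 1
        cell[1] += cost
        cell[2] += cost * cost
    return dict(cells)


def cell_mean_and_var(n, total, total_sq):
    """Steps 2-3 (stand-in for the GAM): the 'predicted' cost of a cell is
    its mean; also return the estimated variance of that mean."""
    mean = total / n
    sample_var = (total_sq - n * mean * mean) / (n - 1)
    return mean, max(sample_var, 0.0) / n


def attributable_cost(cells):
    """Steps 4-5: sum the excess cost of prevalent subjects over strata and
    propagate the variance, assuming independent cells and fixed counts."""
    strata = {(age_group, sex) for (age_group, sex, _) in cells}
    total, variance = 0.0, 0.0
    for age_group, sex in sorted(strata):
        prev = cells.get((age_group, sex, True))
        non = cells.get((age_group, sex, False))
        # A stratum needs at least two subjects in both groups to be usable.
        if prev is None or non is None or prev[0] < 2 or non[0] < 2:
            continue
        mean_prev, var_prev = cell_mean_and_var(*prev)
        mean_non, var_non = cell_mean_and_var(*non)
        n_prev = prev[0]
        total += n_prev * (mean_prev - mean_non)
        variance += n_prev ** 2 * (var_prev + var_non)
    return total, math.sqrt(variance)


# Hypothetical toy data: (age group, sex, prevalent?, annual cost).
records = [
    ("60-69", "f", True, 300.0), ("60-69", "f", True, 500.0),
    ("60-69", "f", False, 100.0), ("60-69", "f", False, 200.0),
    ("60-69", "f", False, 150.0),
    ("70-79", "m", True, 800.0), ("70-79", "m", True, 600.0),
    ("70-79", "m", False, 400.0), ("70-79", "m", False, 300.0),
]

cells = aggregate(records)
total, se = attributable_cost(cells)
print(f"attributable cost: {total:.1f} (standard error {se:.1f})")
```

Because each cell keeps only three numbers, the cost of the later steps depends on the number of cells rather than on the number of insured persons, which is the intuition behind the efficiency gain described above. A GAM fitted to the aggregated cells would additionally smooth across neighboring strata, which this stand-in does not do.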