Tonya Moen Hansen, Knut Stavem, Kim Rand
Division for Health Services, Norwegian Institute of Public Health, Oslo, Norway.
Health Services Research Unit, Akershus University Hospital, Norway.
MDM Policy Pract. 2022 Mar 7;7(1):23814683221083839. doi: 10.1177/23814683221083839. eCollection 2022 Jan-Jun.
National valuation studies are costly, with ∼1000 face-to-face interviews recommended, and some countries may deem such studies infeasible. Building on previous studies exploring sample size, we determined the effect of sample size and alternative model specifications on the prediction accuracy of modeled coefficients in EQ-5D-5L value set-generating regression analyses. Data sets (n = 50 to ∼1000) were simulated from 3 valuation studies, resampled at the respondent level and randomly drawn 1000 times with replacement. We estimated utilities for each subsample with leave-one-out at the block level using regression models (8 or 20 parameters; with or without a random intercept; time tradeoff [TTO] data only or TTO + discrete choice experiment [DCE] data). Prediction accuracy, measured as root mean square error (RMSE), was calculated by comparing censored mean predicted values to the left-out block in the full data set. Linear regression was used to estimate the relative effect of changes in sample size and each model specification. Results showed that doubling the sample size decreased RMSE by 0.012 on average. The effects of other model specifications were smaller but can, when combined, compensate for the loss in prediction accuracy from a small sample size. For models using TTO data only, 8-parameter models clearly outperformed 20-parameter models. Adding a random intercept or including DCE responses also improved mean RMSE, most prominently for variants of the 20-parameter models. The impact on prediction accuracy of increasing the sample size beyond 300 to 500 respondents was smaller than the impact of combining alternative modeling choices. Hybrid modeling, use of constrained models, and inclusion of random intercepts all substantially improve the expected prediction accuracy. Beyond a minimum of 300 to 500 respondents, the sample size may be better informed by other considerations, such as legitimacy and representativeness, than by the technical prediction accuracy achievable.
Increases in sample size beyond a minimum in the range of 300 to 500 respondents provide smaller gains in expected prediction accuracy than alternative modeling approaches. Constrained, nonlinear models; time tradeoff + discrete choice experiment hybrid modeling; and including a random intercept all improved the prediction accuracy of models estimating values for the EQ-5D-5L based on data from 3 different valuation studies. The tested modeling choices can compensate for smaller sample sizes.
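The core of the simulation design described in the abstract is respondent-level bootstrap resampling combined with an RMSE comparison against the full-data predictions. The following is a minimal Python sketch of those two building blocks only; the data set, the number of responses per respondent, and the function names are illustrative assumptions, not the authors' actual code, and the regression-modeling step itself is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical full valuation data set: one row per TTO response,
# identified by respondent id, with an observed utility value.
n_respondents = 1000
responses_per_respondent = 10  # assumed block size, for illustration only
resp_ids = np.repeat(np.arange(n_respondents), responses_per_respondent)
utilities = rng.normal(loc=0.5, scale=0.3, size=resp_ids.size)


def bootstrap_subsample(resp_ids, utilities, n_sub, rng):
    """Draw n_sub respondents with replacement and keep every response
    of each drawn respondent (resampling at the respondent level, as in
    the study design)."""
    drawn = rng.choice(np.unique(resp_ids), size=n_sub, replace=True)
    idx = np.concatenate([np.flatnonzero(resp_ids == r) for r in drawn])
    return utilities[idx]


def rmse(predicted, reference):
    """Root mean square error between a model's predicted values and the
    reference values (here, predictions for the left-out block in the
    full data set)."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return np.sqrt(np.mean((predicted - reference) ** 2))
```

In the study this pair of steps would be repeated 1000 times per subsample size (50 to ∼1000 respondents), fitting each candidate regression model to every bootstrap draw and scoring it with `rmse` against the left-out block.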