Division Farm Animal Health, Department of Population Health Sciences, Utrecht University, TD Utrecht, The Netherlands.
Faculty of Social and Behavioral Sciences, Department of Methodology and Statistics, Utrecht University, TC Utrecht, The Netherlands.
PLoS One. 2021 Jan 14;16(1):e0244752. doi: 10.1371/journal.pone.0244752. eCollection 2021.
Random effects regression models are routinely used for clustered data in etiological and intervention research. However, in prediction models, the random effects are either neglected or conventionally substituted with zero for new clusters after model development. In this study, we applied a Bayesian prediction modelling method to the subclinical ketosis data previously collected by Van der Drift et al. (2012). Using a dataset of 118 randomly selected Dutch dairy farms participating in a regular milk recording system, those authors proposed a prediction model with milk measures as well as available test-day information as predictors for the diagnosis of subclinical ketosis in dairy cows. While their original model included random effects to correct for the clustering, the random effect term was removed from their final prediction model. With the Bayesian prediction modelling approach, we first used non-informative priors for the random effects, both for model development and for prediction. This approach was evaluated by comparing it to the original frequentist model. In addition, herd-level expert opinion was elicited from a bovine health specialist using three different scales of precision and incorporated in the prediction as informative priors for the random effects, resulting in three more Bayesian prediction models. Results showed that the Bayesian approach could naturally take the clustering structure of the data into account by keeping the random effects in the prediction model. Expert opinion could be explicitly combined with individual-level data for prediction. However, in this dataset, incorporating the elicited expert opinion yielded little improvement at either the individual level or the herd level.
When the prediction models were applied to the 118 herds, at the individual cow level, the original frequentist approach gave a sensitivity of 82.4% and a specificity of 83.8% at the optimal cutoff, while the three Bayesian models with elicited expert opinion gave sensitivities ranging from 78.7% to 84.6% and specificities ranging from 75.0% to 83.6%. At the herd level, 30 out of 118 within-herd prevalences were correctly predicted by the original frequentist approach, and 31 to 44 herds were correctly predicted by the three Bayesian models with elicited expert opinion. Further investigation into the expert opinion and the distributional assumption for the random effects was carried out and is discussed.
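The contrast between substituting zero for a new herd's random effect and keeping it in the prediction through a prior can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a logistic model with a single herd-level random intercept, treats the fixed-effect linear predictor as a known number rather than a posterior distribution, and uses a normal prior for the intercept. All function names and numeric values are hypothetical.

```python
import numpy as np


def predict_zero_effect(linear_predictor):
    """Conventional approach: the new herd's random intercept is set to zero."""
    return 1.0 / (1.0 + np.exp(-linear_predictor))


def predict_with_prior(linear_predictor, prior_mean, prior_sd,
                       n_draws=20000, seed=0):
    """Keep the random intercept: average the predicted probability over
    draws from an assumed normal prior for the herd effect. A wide prior
    stands in for a non-informative one; a prior centred on an expert's
    herd-level judgement stands in for an informative one."""
    rng = np.random.default_rng(seed)
    u = rng.normal(prior_mean, prior_sd, size=n_draws)
    return float(np.mean(1.0 / (1.0 + np.exp(-(linear_predictor + u)))))


def sensitivity_specificity(y_true, prob, cutoff):
    """Sensitivity and specificity of the rule 'positive if prob >= cutoff'."""
    y_true = np.asarray(y_true, dtype=bool)
    pred = np.asarray(prob) >= cutoff
    tp = np.sum(pred & y_true)
    fn = np.sum(~pred & y_true)
    tn = np.sum(~pred & ~y_true)
    fp = np.sum(pred & ~y_true)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    eta = -1.2  # hypothetical fixed-effect linear predictor for one cow
    print(predict_zero_effect(eta))
    print(predict_with_prior(eta, prior_mean=0.0, prior_sd=1.0))
    print(predict_with_prior(eta, prior_mean=0.8, prior_sd=0.3))
```

In a full Bayesian analysis the fixed effects would also be drawn from their posterior, so the averaging step would run over joint draws rather than over the herd intercept alone.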