Department of Statistical Science, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies, Tokyo, Japan.
Diagnostics Department, Asahi Kasei Pharma Corporation, Tokyo, Japan.
Biom J. 2021 Feb;63(2):394-405. doi: 10.1002/bimj.201900351. Epub 2020 Nov 9.
The prediction interval is increasingly used in meta-analyses as a measure for assessing the magnitude of the treatment effect and the between-studies heterogeneity. Although the Higgins-Thompson-Spiegelhalter method is the one most often used in practice to calculate the prediction interval, it might not have adequate coverage probability for the true treatment effect of a future study under realistic situations. An effective alternative candidate is the Bayesian prediction interval, which has also been widely used in general prediction problems. However, Bayesian prediction intervals are constructed based on the Bayesian philosophy, and their frequentist validity is justified only by large-sample approximations, even if noninformative priors are adopted. No conclusive evidence has been available on their frequentist performance under realistic meta-analysis situations. In this study, we conducted extensive simulation studies to assess the frequentist coverage performance of Bayesian prediction intervals with 11 noninformative prior distributions under general meta-analysis settings. Through these simulation studies, we found that frequentist coverage performance depended strongly on which prior distribution was adopted. In addition, when the number of studies was smaller than 10, no prior distribution retained accurate frequentist coverage properties. We also illustrated these methods via applications to two real meta-analysis datasets. The resultant prediction intervals also differed according to the adopted prior distribution. Inaccurate prediction intervals may provide invalid evidence and misleading conclusions. Thus, if frequentist accuracy is required, Bayesian prediction intervals should be used cautiously in practice.
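For readers unfamiliar with the Higgins-Thompson-Spiegelhalter interval named in the abstract, the following is a minimal sketch of its commonly cited form, mu_hat ± t(k-2) * sqrt(tau2_hat + SE(mu_hat)^2). The abstract does not state how the heterogeneity variance is estimated, so the DerSimonian-Laird estimator used here is an assumption. The function names and the example data are hypothetical, and the critical value `t_crit` is passed in by the caller (for a 95% interval it is the 0.975 quantile of a t distribution with k-2 degrees of freedom).

```python
import math

def dersimonian_laird(y, v):
    """Random-effects summary from study estimates y and within-study variances v.

    Returns (mu_hat, se_mu, tau2). The DerSimonian-Laird estimator of tau2 is an
    assumption of this sketch, not something stated in the abstract.
    """
    k = len(y)
    if k < 3:
        raise ValueError("at least 3 studies are needed (the interval uses k-2 df)")
    w = [1.0 / vi for vi in v]                      # fixed-effect weights
    sw = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)              # truncated at zero
    w_star = [1.0 / (vi + tau2) for vi in v]        # random-effects weights
    sw_star = sum(w_star)
    mu_hat = sum(wi * yi for wi, yi in zip(w_star, y)) / sw_star
    se_mu = math.sqrt(1.0 / sw_star)
    return mu_hat, se_mu, tau2

def hts_prediction_interval(y, v, t_crit):
    """Prediction interval mu_hat +/- t_crit * sqrt(tau2 + se_mu^2).

    t_crit is supplied by the caller: the upper quantile of a t distribution
    with len(y) - 2 degrees of freedom.
    """
    mu_hat, se_mu, tau2 = dersimonian_laird(y, v)
    half_width = t_crit * math.sqrt(tau2 + se_mu ** 2)
    return mu_hat - half_width, mu_hat + half_width

if __name__ == "__main__":
    # Hypothetical data: three study estimates with unit within-study variances.
    effects = [0.0, 2.0, 4.0]
    variances = [1.0, 1.0, 1.0]
    # 0.975 quantile of t with 1 degree of freedom (k - 2 = 1), about 12.706.
    print(hts_prediction_interval(effects, variances, t_crit=12.706))
```

With few studies, the t quantile with k-2 degrees of freedom becomes very large, so the interval is wide and sensitive to the estimate of tau2. This is the small-sample regime in which the abstract reports coverage problems.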