Willis Brian H, Riley Richard D
Institute of Applied Health Research, University of Birmingham, U.K.
Research Institute for Primary Care and Health Sciences, Keele University, U.K.
Stat Med. 2017 Sep 20;36(21):3283-3301. doi: 10.1002/sim.7372. Epub 2017 Jun 15.
An important question for clinicians appraising a meta-analysis is whether the findings are likely to be valid in their own practice: does the reported effect accurately represent the effect that would occur in their own clinical population? To this end, we advance the concept of statistical validity, where the parameter being estimated equals the corresponding parameter for a new independent study. Using a simple ('leave-one-out') cross-validation technique, we demonstrate how meta-analysis estimates may be tested for statistical validity using a new validation statistic, Vn, and derive its distribution. We compare this with the usual approach of investigating heterogeneity in meta-analyses and demonstrate the link between statistical validity and homogeneity. Using a simulation study, the properties of Vn and the Q statistic are compared for univariate random effects meta-analysis and a tailored meta-regression model, where information from the setting (included as model covariates) is used to calibrate the summary estimate to the setting of application. Their properties are found to be similar when there are 50 studies or more, but with fewer studies Vn has greater power than Q at the cost of a higher type 1 error rate. The power and type 1 error rate of Vn are also shown to depend on the within-study variance, between-study variance, study sample size, and the number of studies in the meta-analysis. Finally, we apply Vn to two published meta-analyses and conclude that it usefully augments standard methods when deciding upon the likely validity of summary meta-analysis estimates in clinical practice. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
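The leave-one-out idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Vn statistic, whose definition and distribution are given in the paper and not in this abstract. The sketch assumes a DerSimonian-Laird random effects model, and the function names (`dersimonian_laird`, `leave_one_out_residuals`) and the example data are hypothetical. Each study is held out in turn, the remaining studies are pooled, and the held-out estimate is compared with the pooled estimate on a standardized scale. Cochran's Q is returned alongside for comparison.

```python
import math


def dersimonian_laird(effects, variances):
    """Return (pooled estimate, its variance, tau^2, Cochran's Q).

    Assumes a DerSimonian-Laird random effects model; `effects` and
    `variances` are lists of study estimates and within-study variances.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    k = len(effects)
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments estimate of between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    return pooled, 1.0 / sum(w_re), tau2, q


def leave_one_out_residuals(effects, variances):
    """Standardized difference between each held-out study and the
    pooled estimate from the remaining studies.

    Illustrative assumption: under the hypothesis that the held-out study
    shares the pooled parameter, the difference has variance
    (within-study variance of held-out study) + (variance of pooled estimate).
    """
    residuals = []
    for i in range(len(effects)):
        rest_y = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        pooled, var_pooled, _, _ = dersimonian_laird(rest_y, rest_v)
        z = (effects[i] - pooled) / math.sqrt(variances[i] + var_pooled)
        residuals.append(z)
    return residuals


# Hypothetical log odds ratios and within-study variances (not from the paper).
example_effects = [0.10, 0.30, 0.35, 0.65, 0.45]
example_variances = [0.03, 0.03, 0.05, 0.01, 0.05]

pooled, var_pooled, tau2, q = dersimonian_laird(example_effects, example_variances)
print("pooled:", pooled, "tau^2:", tau2, "Q:", q)
print("leave-one-out residuals:", leave_one_out_residuals(example_effects, example_variances))
```

Large absolute residuals flag studies whose estimate is poorly predicted by the rest, which is the intuition behind testing a summary estimate for validity in a new setting. How such residuals are combined into a single test statistic, and what reference distribution applies, is left to the paper.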