Estimating cross-validatory predictive p-values with integrated importance sampling for disease mapping models.

Author information

Li Longhai, Feng Cindy X, Qiu Shi

Affiliations

Department of Mathematics and Statistics, University of Saskatchewan, 106 Wiggins Rd, Saskatoon, S7N5E6, SK, Canada.

School of Public Health, University of Saskatchewan, 104 Clinic Place, Saskatoon, S7N5E5, SK, Canada.

Publication information

Stat Med. 2017 Jun 30;36(14):2220-2236. doi: 10.1002/sim.7278. Epub 2017 Mar 12.

Abstract

An important statistical task in disease mapping problems is to identify divergent regions with unusually high or low risk of disease. Leave-one-out cross-validatory (LOOCV) model assessment is the gold standard for estimating predictive p-values that can flag such divergent regions. However, actual LOOCV is time-consuming because one needs to rerun a Markov chain Monte Carlo analysis for each posterior distribution in which an observation is held out as a test case. This paper introduces a new method, called integrated importance sampling (iIS), for estimating LOOCV predictive p-values using only Markov chain samples drawn from the posterior based on the full dataset. The key step in iIS is that we integrate away the latent variables associated with the test observation with respect to their conditional distribution, without reference to the actual observation. By following the general theory of importance sampling, the formula used by iIS can be proved to be equivalent to the LOOCV predictive p-value. We compare iIS with three other existing methods from the literature on two disease mapping datasets. Our empirical results show that the predictive p-values estimated with iIS are almost identical to those estimated with actual LOOCV, and that they are more accurate than those given by the three existing methods, namely posterior predictive checking, ordinary importance sampling, and the ghosting method of Marshall and Spiegelhalter (2003). Copyright © 2017 John Wiley & Sons, Ltd.
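The abstract does not state the iIS formula, so the following is only a minimal sketch of one reading of its key step. It assumes a Poisson likelihood with expected count `expected_i`, a log relative risk as the held-out region's latent variable, and an upper-tail predictive p-value with a 0.5 correction for ties. All function and argument names are hypothetical. The redraws of the latent variable are assumed to be supplied by the caller, drawn from its conditional distribution given the other latent variables and hyperparameters, and not given the observed count.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass P(Y = k) for mean lam > 0."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def upper_tail_midp(y, lam):
    """P(Y > y) + 0.5 * P(Y = y) for Y ~ Poisson(lam) (assumed p-value form)."""
    cdf = sum(poisson_pmf(k, lam) for k in range(y + 1))
    return (1.0 - cdf) + 0.5 * poisson_pmf(y, lam)


def iis_loo_pvalue(y_i, expected_i, log_rr_redraws):
    """Sketch of an integrated-importance-sampling LOOCV p-value for region i.

    log_rr_redraws has one row per full-data posterior draw. Each row holds
    redraws of region i's log relative risk from its conditional distribution
    given that draw's other latent variables and hyperparameters, without
    conditioning on y_i (assumption: the caller generates these).
    """
    numerator = 0.0
    denominator = 0.0
    for redraws in log_rr_redraws:
        means = [expected_i * math.exp(b) for b in redraws]
        # Integrated evaluation: average the tail probability over the redraws.
        integrated_p = sum(upper_tail_midp(y_i, m) for m in means) / len(means)
        # Integrated predictive density of y_i; its inverse is the weight.
        integrated_density = sum(poisson_pmf(y_i, m) for m in means) / len(means)
        weight = 1.0 / integrated_density
        numerator += integrated_p * weight
        denominator += weight
    # Self-normalised importance sampling estimate.
    return numerator / denominator
```

In this sketch a region would be flagged as divergent when the returned value is close to 0 (observed count unusually high) or close to 1 (unusually low). Ordinary importance sampling would instead weight each posterior draw by the inverse likelihood at the latent value fitted to the full data, which is the step that the averaging over redraws replaces here.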
