Multivariate Tail Probabilities: Predicting Regional Pertussis Cases in Washington State.

Authors

Zhang Xuze, Pyne Saumyadipta, Kedem Benjamin

Affiliations

Department of Mathematics and Institute for Systems Research, University of Maryland, College Park, MD 20742, USA.

Public Health Dynamics Laboratory, Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261, USA.

Publication information

Entropy (Basel). 2021 May 27;23(6):675. doi: 10.3390/e23060675.

Abstract

In disease modeling, a key statistical problem is the estimation of lower and upper tail probabilities of health events from given data sets of small size and limited range. Assuming such constraints, we describe a computational framework for the systematic fusion of observations from multiple sources to compute tail probabilities that could not be obtained otherwise due to a lack of lower or upper tail data. The estimation of multivariate lower and upper tail probabilities from a given small reference data set that lacks complete information about such tail data is addressed in terms of pertussis case count data. Fusion of data from multiple sources in conjunction with the density ratio model is used to give probability estimates that are non-obtainable from the empirical distribution. Based on a density ratio model with variable tilts, we first present a univariate fit and, subsequently, improve it with a multivariate extension. In the multivariate analysis, we selected the best model in terms of the Akaike Information Criterion (AIC). Regional prediction, in Washington state, of the number of pertussis cases is approached by providing joint probabilities using fused data from several relatively small samples following the selected density ratio model. The model is validated by a graphical goodness-of-fit plot comparing the estimated reference distribution obtained from the fused data with that of the empirical distribution obtained from the reference sample only.
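For readers unfamiliar with the density ratio model, the following is a minimal sketch of its usual semiparametric form, not a reproduction of the paper's own equations. All notation here is assumed for illustration: g is the reference density, g_1, ..., g_m are the densities of the other sources, h_j are the tilt functions (allowed to differ across sources, which is one reading of "variable tilts"), n_0 and n_j are sample sizes, t_1, ..., t_n are the pooled (fused) observations, T is a threshold, and k is the number of free parameters.

```latex
% Density ratio model: each source density is an exponential tilt of the reference density g
\frac{g_j(x)}{g(x)} = \exp\!\left(\alpha_j + \beta_j^{\top} h_j(x)\right), \qquad j = 1, \dots, m

% Estimated probability mass placed on each fused observation t_i, with \rho_j = n_j / n_0
\hat{p}_i = \frac{1}{n_0} \cdot
  \frac{1}{1 + \sum_{j=1}^{m} \rho_j \exp\!\left(\hat{\alpha}_j + \hat{\beta}_j^{\top} h_j(t_i)\right)},
  \qquad i = 1, \dots, n

% Estimated reference distribution function and the resulting upper tail probability
\hat{G}(x) = \sum_{i=1}^{n} \hat{p}_i \, \mathbf{1}\{t_i \le x\}, \qquad
\widehat{P}(X > T) = 1 - \hat{G}(T)

% Akaike Information Criterion used to compare candidate tilt choices (smaller is better)
\mathrm{AIC} = -2 \log \hat{L} + 2k
```

Under this formulation the fused sample can place mass beyond the range of the reference sample alone, which is why tail probabilities that the empirical distribution would report as zero can receive nonzero estimates.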


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8bcc/8226468/b59831e4b50c/entropy-23-00675-g001.jpg
