Online Updating of Statistical Inference in the Big Data Setting.

Author information

Schifano Elizabeth D, Wu Jing, Wang Chun, Yan Jun, Chen Ming-Hui

Affiliation

Department of Statistics, University of Connecticut.

Publication information

Technometrics. 2016;58(3):393-403. doi: 10.1080/00401706.2016.1142900. Epub 2016 Jul 8.

Abstract

We present statistical methods for big data arising from online analytical processing, where large amounts of data arrive in streams and require fast analysis without storage/access to the historical data. In particular, we develop iterative estimating algorithms and statistical inferences for linear models and estimating equations that update as new data arrive. These algorithms are computationally efficient, minimally storage-intensive, and allow for possible rank deficiencies in the subset design matrices due to rare-event covariates. Within the linear model setting, the proposed online-updating framework leads to predictive residual tests that can be used to assess the goodness-of-fit of the hypothesized model. We also propose a new online-updating estimator under the estimating equation setting. Theoretical properties of the goodness-of-fit tests and proposed estimators are examined in detail. In simulation studies and real data applications, our estimator compares favorably with competing approaches under the estimating equation setting.
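As a rough illustration of the online-updating idea for linear models, the sketch below keeps only cumulative summary statistics (X'X, X'y, y'y and the sample size) and refreshes the least-squares fit as each block arrives, so no historical rows are stored. It is a minimal sketch under stated assumptions, not the authors' algorithm: the class name `OnlineLeastSquares`, the simulated data, and the use of a minimum-norm solve to tolerate rank deficiency are all illustrative choices, and the paper's own treatment of rank-deficient blocks and its goodness-of-fit tests are not reproduced here.

```python
import numpy as np


class OnlineLeastSquares:
    """Hypothetical helper: least squares from cumulative summary statistics."""

    def __init__(self, p):
        self.xtx = np.zeros((p, p))  # running sum of X'X
        self.xty = np.zeros(p)       # running sum of X'y
        self.yty = 0.0               # running sum of y'y
        self.n = 0                   # rows seen so far

    def update(self, X, y):
        """Fold one block into the summaries; the block can then be discarded."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self.xtx += X.T @ X
        self.xty += X.T @ y
        self.yty += y @ y
        self.n += X.shape[0]

    def coef(self):
        """Solve the cumulative normal equations.

        lstsq returns the minimum-norm solution when X'X is singular, so the
        call still works while a rare-event column is all zeros (assumption:
        this stands in for whatever generalized inverse one prefers).
        """
        return np.linalg.lstsq(self.xtx, self.xty, rcond=None)[0]

    def sigma2(self):
        """Residual variance estimate from the summaries alone."""
        beta = self.coef()
        rss = self.yty - beta @ self.xty  # valid because beta solves X'X b = X'y
        return rss / (self.n - np.linalg.matrix_rank(self.xtx))


# Simulated stream: 3 blocks of 200 rows; the rare-event indicator is zero in
# the first two blocks, so those blocks are rank deficient on their own.
rng = np.random.default_rng(0)
n, block = 600, 200
rare = np.zeros(n)
rare[450:460] = 1.0
X = np.column_stack([np.ones(n), rng.normal(size=n), rare])
y = X @ np.array([1.0, 2.0, -3.0]) + rng.normal(size=n)

est = OnlineLeastSquares(p=3)
for start in range(0, n, block):
    est.update(X[start:start + block], y[start:start + block])

# Reference fit that uses all rows at once, for comparison only.
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_full
sigma2_full = resid @ resid / (n - 3)
```

After the final block, `est.coef()` and `est.sigma2()` should agree with the full-data fit up to floating-point error, while the estimator holds only a 3-by-3 matrix, a length-3 vector and two scalars. Solving normal equations squares the condition number of the design, so a production version would likely prefer a more stable update.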
