
Online updating method with new variables for big data streams.

Authors

Wang Chun, Chen Ming-Hui, Wu Jing, Yan Jun, Zhang Yuping, Schifano Elizabeth

Affiliations

Liberty Mutual Insurance, Boston, MA, USA.

Department of Statistics, University of Connecticut, Storrs, CT, USA.

Publication information

Can J Stat. 2018 Mar;46(1):123-146. doi: 10.1002/cjs.11330. Epub 2017 Aug 9.

DOI:10.1002/cjs.11330
PMID:29662263
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5898930/
Abstract

For big data arriving in streams, online updating is an important statistical method that breaks the storage barrier and the computational barrier under certain circumstances. In the regression context, online updating algorithms assume that the set of predictor variables does not change, and consequently cannot incorporate new variables that may become available midway through the data stream. A naive approach would be to discard all previous information and start updating with new variables from scratch. We propose a method that utilizes the information from earlier data in the online updating algorithm with bias corrections to improve efficiency. The method is developed for linear models first, and then extended to estimating equations for generalized linear models. Closed-form expressions for the efficiency gain over the naive approach are derived in a particular linear model setting. We compare the performance of our proposed bias-correcting approach and the naive approach in simulation studies with data generated from a normal linear model and a logistic regression model. The method is applied to a study on airline delay, where reasons for delays were only available more recently, starting in 2003.
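
To make the setting concrete, here is a minimal sketch of standard online updating for least squares, together with the naive restart that the abstract uses as its baseline. It is not the authors' bias-corrected estimator, which the abstract does not specify. The class name `OnlineOLS`, the simulated data, the block sizes, and the block index at which the new variable appears are all illustrative assumptions.

```python
import numpy as np


class OnlineOLS:
    """Accumulates X'X and X'y block by block; raw blocks are never stored."""

    def __init__(self, p):
        self.xtx = np.zeros((p, p))
        self.xty = np.zeros(p)
        self.n = 0

    def update(self, X, y):
        # Each block contributes only its cross-products.
        self.xtx += X.T @ X
        self.xty += X.T @ y
        self.n += X.shape[0]

    def coef(self):
        # Least-squares solution from the accumulated sufficient statistics.
        return np.linalg.solve(self.xtx, self.xty)


# Simulated stream (assumption): two original predictors X and one predictor z
# that only becomes observable from block index 6 onward.
rng = np.random.default_rng(0)
beta = np.array([1.0, -2.0])
gamma = 0.5


def make_block(n):
    X = rng.normal(size=(n, 2))
    z = rng.normal(size=n)  # drawn independently of X, purely for simplicity
    y = X @ beta + gamma * z + rng.normal(size=n)
    return X, z, y


blocks = [make_block(200) for _ in range(10)]
first_block_with_z = 6

reduced = OnlineOLS(2)  # original predictors only, uses every block
naive = OnlineOLS(3)    # full model, restarted once z becomes available

for k, (X, z, y) in enumerate(blocks):
    reduced.update(X, y)
    if k >= first_block_with_z:
        naive.update(np.column_stack([X, z]), y)

# Reference fit on the pooled data, for comparison with the online estimate.
X_all = np.vstack([b[0] for b in blocks])
y_all = np.concatenate([b[2] for b in blocks])
batch_coef = np.linalg.lstsq(X_all, y_all, rcond=None)[0]

print("online, original predictors:", reduced.coef(), "n =", reduced.n)
print("naive full model:          ", naive.coef(), "n =", naive.n)
```

In this sketch the online estimate for the original predictors matches the pooled least-squares fit, while the naive full-model fit uses only the 800 observations from the last four blocks and ignores the 1200 earlier ones. That discarded information is what the proposed method aims to recover.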
