Multiple outputation: inference for complex clustered data by averaging analyses from independent data.

Authors

Follmann Dean, Proschan Michael, Leifer Eric

Affiliation

National Institute of Allergy and Infectious Diseases, 6700B Rockledge Drive MSC 7609, Bethesda, Maryland 20892, USA.

Publication

Biometrics. 2003 Jun;59(2):420-9. doi: 10.1111/1541-0420.00049.

DOI: 10.1111/1541-0420.00049
PMID: 12926727

Abstract

This article applies a simple method for settings where one has clustered data, but statistical methods are only available for independent data. We assume the statistical method provides us with a normally distributed estimate, theta, and an estimate of its variance, sigma^2. We randomly select one data point from each cluster and apply our statistical method to this independent data. We repeat this multiple times, and use the average of the associated theta's as our estimate. An estimate of the variance is given by the average of the sigma^2's minus the sample variance of the theta's. We call this procedure multiple outputation, as all "excess" data within each cluster is thrown out multiple times. Hoffman, Sen, and Weinberg (2001, Biometrika 88, 1121-1134) introduced this approach for generalized linear models when the cluster size is related to outcome. In this article, we demonstrate the broad applicability of the approach. Applications to angular data, p-values, vector parameters, Bayesian inference, genetics data, and random cluster sizes are discussed. In addition, asymptotic normality of estimates based on all possible outputations, as well as a finite number of outputations, is proven given weak conditions. Multiple outputation provides a simple and broadly applicable method for analyzing clustered data. It is especially suited to settings where methods for clustered data are impractical, but can also be applied generally as a quick and simple tool.
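The procedure described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `multiple_outputation` and `mean_with_variance`, the choice of the sample mean as the independent-data method, the toy data, and the default of 1000 outputations are all hypothetical.

```python
import random
import statistics


def mean_with_variance(sample):
    """Hypothetical independent-data method: the sample mean (theta)
    and its estimated variance s^2 / n (sigma^2)."""
    n = len(sample)
    return statistics.fmean(sample), statistics.variance(sample) / n


def multiple_outputation(clusters, estimator, n_outputations=1000, seed=0):
    """Average an independent-data analysis over repeated outputations.

    clusters: list of lists, one inner list of observations per cluster.
    estimator: function mapping an independent sample to (theta, sigma2).
    Returns (theta_mo, var_mo) following the recipe in the abstract.
    """
    rng = random.Random(seed)
    thetas, sigma2s = [], []
    for _ in range(n_outputations):
        # Keep one randomly chosen observation per cluster and
        # "output" (discard) the rest, giving independent data.
        sample = [rng.choice(cluster) for cluster in clusters]
        theta, sigma2 = estimator(sample)
        thetas.append(theta)
        sigma2s.append(sigma2)
    # Point estimate: average of the theta's.
    theta_mo = statistics.fmean(thetas)
    # Variance estimate: average of the sigma^2's minus the
    # sample variance of the theta's.
    var_mo = statistics.fmean(sigma2s) - statistics.variance(thetas)
    return theta_mo, var_mo


if __name__ == "__main__":
    # Toy clustered data (made up for illustration only).
    toy_clusters = [[1.0, 1.2, 0.9], [2.0], [3.1, 2.9], [4.0, 4.2, 3.8], [5.0, 5.3]]
    estimate, variance = multiple_outputation(toy_clusters, mean_with_variance)
    print(estimate, variance)
```

Nothing in this sketch prevents the variance estimate from coming out negative, since it is a difference of two estimated quantities, so the returned value is worth checking before it is used for a confidence interval.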

Similar articles

1
Multiple outputation: inference for complex clustered data by averaging analyses from independent data.
Biometrics. 2003 Jun;59(2):420-9. doi: 10.1111/1541-0420.00049.
2
Within-Cluster Resampling for Analysis of Family Data: Ready for Prime-Time?
Stat Interface. 2010 Apr 1;3(2):169-176. doi: 10.4310/sii.2010.v3.n2.a4.
3
Marginal analyses of clustered data when cluster size is informative.
Biometrics. 2003 Mar;59(1):36-42. doi: 10.1111/1541-0420.00005.
4
Part 1. Statistical Learning Methods for the Effects of Multiple Air Pollution Constituents.
Res Rep Health Eff Inst. 2015 Jun(183 Pt 1-2):5-50.
5
[Meta-analysis of the Italian studies on short-term effects of air pollution].
Epidemiol Prev. 2001 Mar-Apr;25(2 Suppl):1-71.
6
Multiple outputation for the analysis of longitudinal data subject to irregular observation.
Stat Med. 2016 May 20;35(11):1800-18. doi: 10.1002/sim.6829. Epub 2015 Dec 13.
7
Flexible Bayesian quantile regression for independent and clustered data.
Biostatistics. 2010 Apr;11(2):337-52. doi: 10.1093/biostatistics/kxp049. Epub 2009 Nov 30.
8
Incorporating correlation for multivariate failure time data when cluster size is large.
Biometrics. 2010 Jun;66(2):393-404. doi: 10.1111/j.1541-0420.2009.01307.x. Epub 2009 Aug 10.
9
A Bayesian approach for joint modeling of cluster size and subunit-specific outcomes.
Biometrics. 2003 Sep;59(3):521-30. doi: 10.1111/1541-0420.00062.
10
Multilevel modelling of clustered grouped survival data using Cox regression model: an application to ART dental restorations.
Stat Med. 2006 Feb 15;25(3):447-57. doi: 10.1002/sim.2235.

Cited by

1
Global and Episode-Specific Prediction of Recurrent Events Using Longitudinal Health Informatics Data.
J Am Stat Assoc. 2025 Jul 3. doi: 10.1080/01621459.2025.2497569.
2
Variable selection in modelling clustered data via within-cluster resampling.
Can J Stat. 2025 Mar;53(1). doi: 10.1002/cjs.11824. Epub 2024 Aug 1.
3
Population Structure and Antimicrobial Resistance in Campylobacter jejuni and C. coli Isolated from Humans with Diarrhea and from Poultry, East Africa.
Emerg Infect Dis. 2024 Oct;30(10):2079-2089. doi: 10.3201/eid3010.231399.
4
Association of Remote Patient-Reported Outcomes and Step Counts With Hospitalization or Death Among Patients With Advanced Cancer Undergoing Chemotherapy: Secondary Analysis of the PROStep Randomized Trial.
J Med Internet Res. 2024 May 17;26:e51059. doi: 10.2196/51059.
5
Effect of pre-operative warm-up on trainee intraoperative performance during robot-assisted hysterectomy: a randomized controlled trial.
Int Urogynecol J. 2023 Nov;34(11):2751-2758. doi: 10.1007/s00192-023-05595-1. Epub 2023 Jul 14.
6
Reappraisal of Idiopathic CD4 Lymphocytopenia at 30 Years.
N Engl J Med. 2023 May 4;388(18):1680-1691. doi: 10.1056/NEJMoa2202348.
7
Randomized Trials With Repeatedly Measured Outcomes: Handling Irregular and Potentially Informative Assessment Times.
Epidemiol Rev. 2022 Dec 21;44(1):121-137. doi: 10.1093/epirev/mxac010.
8
Alert-Triggered Patient Education Versus Nurse Feedback for Nonadministered Venous Thromboembolism Prophylaxis Doses: A Cluster-Randomized Controlled Trial.
J Am Heart Assoc. 2022 Sep 20;11(18):e027119. doi: 10.1161/JAHA.122.027119. Epub 2022 Sep 1.
9
Trajectories of Visual and Vestibular Markers of Youth Concussion.
J Neurotrauma. 2022 Oct;39(19-20):1382-1390. doi: 10.1089/neu.2022.0014.
10
Causal inference methods for vaccine sieve analysis with effect modification.
Stat Med. 2022 Apr 15;41(8):1513-1524. doi: 10.1002/sim.9302. Epub 2022 Jan 19.