Modelling the hierarchical structure in datasets with very small clusters: a simulation study to explore the effect of the proportion of clusters when the outcome is continuous.

Affiliation

AG Epidemiologie & International Public Health, Universität Bielefeld, Bielefeld, Germany.

Publication information

Stat Med. 2013 Apr 15;32(8):1429-38. doi: 10.1002/sim.5638. Epub 2012 Oct 1.

Abstract

In cluster-randomised trials, the problem of non-independence within clusters is well known, and appropriate statistical analysis documented. Clusters typically seen in cluster trials are large in size and few in number, whereas datasets of preterm infants incorporate clusters of size two (twins), size three (triplets) and so on, with the majority of infants being in 'clusters' of size one. In such situations, it is unclear whether adjustment for clustering is needed or even possible. In this paper, we compared analyses allowing for clustering (linear mixed model) with analyses ignoring clustering (linear regression). Through simulations based on two real datasets, we explored estimation bias in predictors of a continuous outcome in different size datasets typical of preterm samples, with varying percentages of twins. Overall, the biases for estimated coefficients were similar for linear regression and mixed models, but the standard errors were consistently much less well estimated when using a linear model. Non-convergence was rare but was observed in approximately 5% of mixed models for samples below 200 and percentage of twins 2% or less. We conclude that in datasets with small clusters, mixed models should be the method of choice irrespective of the percentage of twins. If the mixed model does not converge, a linear regression can be fitted, but standard error will be underestimated, and so type I error may be inflated.
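The standard-error finding can be illustrated with a minimal simulation sketch. It is not the authors' simulation design: it does not use their two datasets and it does not fit a mixed model. It only shows why a linear regression that ignores clustering can report standard errors that are too small. Everything in it is an assumption made for illustration: the function names, the single predictor shared by both twins (for example a pregnancy-level covariate), the unit total variance, the intraclass correlation `icc`, and the share of clusters that are twin pairs (`twin_share`). The sketch compares the average standard error reported by ordinary least squares with the empirical standard deviation of the slope estimates across replicates.

```python
import math
import random
import statistics


def simulate_once(rng, n_clusters, twin_share, icc, beta=1.0):
    """Simulate one dataset and return (OLS slope, naive OLS standard error).

    All settings are hypothetical illustration values, not taken from the paper.
    """
    sd_u = math.sqrt(icc)        # between-cluster SD (total variance fixed at 1)
    sd_e = math.sqrt(1.0 - icc)  # within-cluster (residual) SD
    xs, ys = [], []
    for _ in range(n_clusters):
        size = 2 if rng.random() < twin_share else 1  # twin pair or singleton
        x = rng.gauss(0.0, 1.0)   # predictor shared by all infants in the cluster
        u = rng.gauss(0.0, sd_u)  # random intercept shared within the cluster
        for _ in range(size):
            xs.append(x)
            ys.append(beta * x + u + rng.gauss(0.0, sd_e))

    # Closed-form simple linear regression that ignores the clustering.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    naive_se = math.sqrt(rss / (n - 2) / sxx)
    return slope, naive_se


def se_ratio(n_clusters=150, twin_share=0.5, icc=0.5, reps=2000, seed=1):
    """Mean naive OLS SE divided by the empirical SD of the slope estimates.

    A ratio near 1 means the reported SE matches the true sampling variability;
    a ratio below 1 means the SE is underestimated.
    """
    rng = random.Random(seed)
    results = [simulate_once(rng, n_clusters, twin_share, icc) for _ in range(reps)]
    slopes = [slope for slope, _ in results]
    naive_ses = [se for _, se in results]
    return statistics.fmean(naive_ses) / statistics.stdev(slopes)
```

Calling `se_ratio(twin_share=0.0)` and `se_ratio(twin_share=0.5)` contrasts a sample with no twins against one in which half of the clusters are twin pairs. A full replication of the comparison in the abstract would also fit a linear mixed model with a random intercept per cluster, which this sketch leaves out.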
