Comparing methods of analysing datasets with small clusters: case studies using four paediatric datasets.

Authors

Marston Louise, Peacock Janet L, Yu Keming, Brocklehurst Peter, Calvert Sandra A, Greenough Anne, Marlow Neil

Affiliation

Department of Primary Care and Population Health, Computing and Mathematics, Brunel University, London, UK.

Publication information

Paediatr Perinat Epidemiol. 2009 Jul;23(4):380-92. doi: 10.1111/j.1365-3016.2009.01046.x.

DOI: 10.1111/j.1365-3016.2009.01046.x
PMID: 19523085

Abstract

Studies of prematurely born infants contain a relatively large percentage of multiple births, so the resulting data have a hierarchical structure with small clusters of size 1, 2 or 3. Ignoring the clustering may lead to incorrect inferences. The aim of this study was to compare statistical methods which can be used to analyse such data: generalised estimating equations, multilevel models, multiple linear regression and logistic regression. Four datasets which differed in total size and in percentage of multiple births (n = 254, multiple 18%; n = 176, multiple 9%; n = 10 098, multiple 3%; n = 1585, multiple 8%) were analysed. With the continuous outcome, two-level models produced similar results in the larger dataset, while generalised least squares multilevel modelling (ML GLS 'xtreg' in Stata) and maximum likelihood multilevel modelling (ML MLE 'xtmixed' in Stata) produced divergent estimates using the smaller dataset. For the dichotomous outcome, most methods, except generalised least squares multilevel modelling (ML GH 'xtlogit' in Stata) gave similar odds ratios and 95% confidence intervals within datasets. For the continuous outcome, our results suggest using multilevel modelling. We conclude that generalised least squares multilevel modelling (ML GLS 'xtreg' in Stata) and maximum likelihood multilevel modelling (ML MLE 'xtmixed' in Stata) should be used with caution when the dataset is small. Where the outcome is dichotomous and there is a relatively large percentage of non-independent data, it is recommended that these are accounted for in analyses using logistic regression with adjusted standard errors or multilevel modelling. If, however, the dataset has a small percentage of clusters greater than size 1 (e.g. a population dataset of children where there are few multiples) there appears to be less need to adjust for clustering.
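The abstract's closing point, that adjustment matters less when few clusters exceed size 1, can be illustrated with a rough design-effect calculation. The sketch below is not the authors' analysis. It uses the standard variance inflation factor for a simple mean under an exchangeable within-cluster correlation, 1 + (Σm²/Σm − 1)ρ, where m is the cluster size. Three things are assumptions made only for illustration: that the reported percentage is the share of infants who belong to a multiple birth, that every multiple is a twin pair, and that the intracluster correlation is 0.5 (a hypothetical value, not one reported in the paper).

```python
def design_effect(cluster_sizes, icc):
    """Variance inflation for a simple mean when observations in the same
    cluster share an exchangeable correlation `icc`."""
    n = sum(cluster_sizes)
    # Size-weighted mean cluster size: sum(m^2) / sum(m)
    weighted_mean_size = sum(m * m for m in cluster_sizes) / n
    return 1 + (weighted_mean_size - 1) * icc


def twin_cluster_sizes(n_infants, share_multiple):
    """Hypothetical helper: build cluster sizes assuming every multiple
    birth is a twin pair (an illustrative simplification)."""
    n_twins = round(n_infants * share_multiple)
    n_twins -= n_twins % 2  # twins come in pairs
    n_singletons = n_infants - n_twins
    return [1] * n_singletons + [2] * (n_twins // 2)


if __name__ == "__main__":
    assumed_icc = 0.5  # hypothetical value, not taken from the paper
    # (n, share of multiples) as listed in the abstract
    datasets = [(254, 0.18), (176, 0.09), (10098, 0.03), (1585, 0.08)]
    for n, share in datasets:
        sizes = twin_cluster_sizes(n, share)
        deff = design_effect(sizes, assumed_icc)
        print(f"n={n}, multiples={share:.0%}: design effect = {deff:.3f}")
```

Under these assumptions the design effect reduces to roughly 1 + ρ × (share of multiples), so a dataset with 3% multiples inflates the variance far less than one with 18%. This is consistent with the abstract's conclusion, although it covers only a simple mean and not the regression models the study compared.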

