A comparison study on modeling of clustered and overdispersed count data for multiple comparisons.

Author information

Kruppa Jochen, Hothorn Ludwig

Affiliations

Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Biometry and Clinical Epidemiology, Berlin, Germany.

Berlin Institute of Health (BIH), Berlin, Germany.

Publication information

J Appl Stat. 2020 Jul 3;48(16):3220-3232. doi: 10.1080/02664763.2020.1788518. eCollection 2021.

Abstract

Data collected in many scientific fields are count data. One way to analyze such data is to compare the individual levels of the treatment factor using multiple comparisons. However, the measured individuals are often clustered, e.g. according to litter or rearing. This clustering must be taken into account when estimating the parameters with a repeated measurement model. In addition, ignoring the overdispersion to which count data are prone leads to an inflated type I error rate. We carry out simulation studies under several different data settings and compare multiple contrast tests based on parameter estimates from generalized estimating equations and from generalized linear mixed models with respect to coverage and rejection probabilities. We generate overdispersed, clustered count data in small samples, as observed in many biological settings. We found that generalized estimating equations outperform generalized linear mixed models if the sandwich variance estimator is correctly specified. Furthermore, generalized linear mixed models show convergence problems under certain data settings, although some model implementations are less affected. Finally, we use an example of genetic data to demonstrate the application of the multiple contrast test and the problems caused by ignoring strong overdispersion.
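To make the data setting in the abstract concrete, the following is a minimal sketch of how overdispersed, clustered counts could be generated. It is not the authors' simulation code. The mechanism is an assumption chosen for illustration: a normal random intercept on the log scale is shared within each cluster, and a gamma-Poisson mixture adds extra-Poisson variation. All names and parameter values (`simulate_counts`, `cluster_sd`, `dispersion`, 200 clusters of size 5) are hypothetical, and the sample is deliberately larger than the small samples studied in the paper so that the variance-to-mean ratio is stable.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the moderate means used here; not intended for very large lam.
    """
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_counts(rng, n_clusters, cluster_size, mean, cluster_sd, dispersion):
    """Return a list of clusters, each a list of counts.

    cluster_sd: SD of a normal random intercept on the log scale,
                shared by all individuals in a cluster (assumed mechanism).
    dispersion: gamma shape parameter; smaller values give stronger
                overdispersion (assumed mechanism).
    """
    data = []
    for _ in range(n_clusters):
        # One random effect per cluster induces within-cluster correlation.
        cluster_mean = mean * math.exp(rng.gauss(0.0, cluster_sd))
        counts = []
        for _ in range(cluster_size):
            # Gamma multiplier with mean 1 and variance 1 / dispersion.
            multiplier = rng.gammavariate(dispersion, 1.0 / dispersion)
            counts.append(poisson(rng, cluster_mean * multiplier))
        data.append(counts)
    return data


rng = random.Random(2020)  # fixed seed so the run is reproducible
clusters = simulate_counts(
    rng, n_clusters=200, cluster_size=5, mean=4.0, cluster_sd=0.5, dispersion=2.0
)
flat = [y for cluster in clusters for y in cluster]
sample_mean = statistics.mean(flat)
sample_var = statistics.variance(flat)
print(f"mean={sample_mean:.2f} variance={sample_var:.2f} "
      f"ratio={sample_var / sample_mean:.2f}")
```

For a pure Poisson sample the variance-to-mean ratio would be close to 1. A ratio well above 1 is the overdispersion that, according to the abstract, inflates the type I error rate when it is ignored.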
