Department of Mathematical Sciences, Durham University, Durham, UK.
Bundesamt für Strahlenschutz (BfS), Oberschleissheim, Germany.
Int J Biostat. 2021 May 7;18(1):183-202. doi: 10.1515/ijb-2020-0079.
For the modelling of count data, aggregation of the raw data over certain subgroups or predictor configurations is common practice. This is, for instance, the case for count data biomarkers of radiation exposure. Under the Poisson law, count data can be aggregated without loss of information on the Poisson parameter, which remains true if the Poisson assumption is relaxed towards quasi-Poisson. However, in biodosimetry in particular, but also beyond, the question of how the dispersion estimates for quasi-Poisson models behave under data aggregation has received little attention. Indeed, for real data sets featuring unexplained heterogeneities, dispersion estimates can increase strongly after aggregation, an effect which we will demonstrate and quantify explicitly for some scenarios. The increase in dispersion estimates implies an inflation of the parameter standard errors, which, however, by comparison with random effect models, can be shown to serve a corrective purpose. The phenomena are illustrated by γ-H2AX foci data as used, for instance, in radiation biodosimetry for the calibration of dose-response curves.
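The aggregation effect described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' analysis. It assumes an intercept-only quasi-Poisson model, a gamma-distributed random effect per group as the source of unexplained heterogeneity, and the Pearson estimator of dispersion. The function names and all parameter values (`n_groups`, `n_per_group`, `mu`, `tau`) are hypothetical choices made for illustration.

```python
import numpy as np


def pearson_dispersion(counts, exposure, rate_hat):
    """Pearson dispersion estimate for an intercept-only (quasi-)Poisson model.

    Each count has expected value exposure * rate_hat; one parameter is fitted.
    """
    expected = exposure * rate_hat
    return np.sum((counts - expected) ** 2 / expected) / (len(counts) - 1)


def compare_dispersion(n_groups=50, n_per_group=200, mu=2.0, tau=0.05, seed=1):
    """Simulate heterogeneous counts and compare raw vs aggregated estimates.

    Assumption: each group k has rate mu * u_k, with u_k gamma-distributed
    with mean 1 and variance tau (unexplained between-group heterogeneity).
    """
    rng = np.random.default_rng(seed)
    u = rng.gamma(shape=1.0 / tau, scale=tau, size=n_groups)
    y = rng.poisson(mu * u[:, None], size=(n_groups, n_per_group))

    # Raw data: one observation per unit, exposure 1 each.
    raw = y.ravel()
    rate_raw = raw.mean()
    phi_raw = pearson_dispersion(raw, np.ones(raw.size), rate_raw)

    # Aggregated data: one total per group, exposure n_per_group each.
    totals = y.sum(axis=1)
    exposure = np.full(n_groups, float(n_per_group))
    rate_agg = totals.sum() / exposure.sum()
    phi_agg = pearson_dispersion(totals, exposure, rate_agg)

    # Quasi-Poisson standard error of the estimated rate:
    # sqrt(dispersion * rate / total exposure).
    total_exposure = float(n_groups * n_per_group)
    return {
        "rate_raw": rate_raw,
        "rate_agg": rate_agg,
        "phi_raw": phi_raw,
        "phi_agg": phi_agg,
        "se_raw": np.sqrt(phi_raw * rate_raw / total_exposure),
        "se_agg": np.sqrt(phi_agg * rate_agg / total_exposure),
    }


result = compare_dispersion()
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the estimated rate is the same for raw and aggregated data, while the variance of a group total is roughly n·mu + n²·mu²·tau, so the aggregated dispersion is expected to be near 1 + n·tau·mu compared with about 1 + tau·mu for the raw data. The standard error computed from the aggregated data is correspondingly larger, which is the inflation the abstract refers to.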