Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC.
Medical Practice Evaluation Center, Massachusetts General Hospital, Boston, MA; Division of General Academic Pediatrics, Massachusetts General Hospital, Boston, MA.
Ann Epidemiol. 2021 Jan;53:106-108.e1. doi: 10.1016/j.annepidem.2020.06.005. Epub 2020 Oct 20.
In prospective cohort studies, incidence is typically estimated by the ratio of the observed number of events to person-time at risk. This crude estimator is consistent for the true population incidence rate (IR) under mild assumptions. Here we consider a different setting where only cross-sectional data are available, that is, participants are evaluated at a single time point to determine whether they have previously had the event of interest.
Unlike in the prospective cohort setting, the crude IR estimator is biased when applied to cross-sectional data. Instead, the maximum likelihood estimator (MLE) may be used. Although the MLE does not have a simple closed form, it is consistent and easy to compute using statistical software. To compare the bias of the MLE and the crude estimator, a simulation was conducted.
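The comparison can be illustrated with a minimal sketch. It rests on assumptions not spelled out in the abstract: the hazard is a constant λ, each participant i has a known time at risk t_i up to the evaluation, and y_i indicates whether the event occurred before evaluation, so that P(y_i = 1) = 1 - exp(-λ t_i). The crude estimator is taken here to be the number of events divided by the total time at risk up to evaluation. The function names and simulation settings are hypothetical and are not those of the authors.

```python
import math
import random


def crude_ir(times, events):
    """Crude estimator: events divided by total time at risk up to evaluation.

    Assumption: event times are unknown, so each participant contributes
    the full time t_i, including time after the (unobserved) event.
    """
    return sum(events) / sum(times)


def score(lam, times, events):
    """Derivative of the log-likelihood
    sum_i [ y_i * log(1 - exp(-lam * t_i)) - (1 - y_i) * lam * t_i ]
    with respect to lam (constant-hazard assumption, lam > 0, t_i > 0)."""
    total = 0.0
    for t, y in zip(times, events):
        if y:
            x = lam * t
            # t / (exp(x) - 1); treated as 0 for very large x to avoid overflow
            if x < 700.0:
                total += t / math.expm1(x)
        else:
            total -= t
    return total


def mle_ir(times, events, iterations=100):
    """Solve score(lam) = 0 by bisection; the score is decreasing in lam."""
    if not any(events):
        return 0.0          # no events observed: likelihood is maximized at 0
    if all(events):
        return math.inf     # everyone had the event: likelihood increases without bound
    lo, hi = 0.0, 1.0
    while score(hi, times, events) > 0:
        hi *= 2.0           # expand the bracket until the score changes sign
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if score(mid, times, events) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def simulate(n, lam, seed=0):
    """Illustrative cross-sectional data: time at risk uniform on [1, 20],
    latent event time exponential with rate lam; only the indicator is kept."""
    rng = random.Random(seed)
    times = [rng.uniform(1.0, 20.0) for _ in range(n)]
    events = [1 if rng.expovariate(lam) <= t else 0 for t in times]
    return times, events


if __name__ == "__main__":
    times, events = simulate(20000, 0.05, seed=1)
    print("crude:", crude_ir(times, events))
    print("MLE:  ", mle_ir(times, events))
```

Bisection on the score is used only because it needs nothing beyond the standard library; any one-dimensional optimizer in a statistical package would serve the same purpose.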
The crude estimator underestimated the true incidence, whereas the MLE was approximately unbiased. In general, the bias of the crude estimator tended to be roughly one to two orders of magnitude larger (in absolute value) than that of the MLE.
Under cross-sectional data with exact event times unknown, the MLE of the IR is straightforward to calculate, more accurate than the crude IR estimator, and consistent provided the hazard is constant.