Mirzaei Salehabadi Sedigheh, Sengupta Debasis
Applied Statistical Unit, Indian Statistical Institute, Kolkata, 700108, India.
Lifetime Data Anal. 2016 Oct;22(4):473-503. doi: 10.1007/s10985-015-9345-9. Epub 2015 Sep 21.
In a cross-sectional observational study, the time-to-event distribution can be estimated from data on current status or from recalled data on the time of occurrence. In either case, one can treat the data as having been interval censored, and use the nonparametric maximum likelihood estimator proposed by Turnbull (J R Stat Soc Ser B 38:290-295, 1976). However, the chance of recall may depend on the time span between the occurrence of the event and the time of interview. In such a case, the underlying censoring would be informative, rendering the Turnbull estimator inappropriate. In this article, we provide a nonparametric maximum likelihood estimator of the distribution of interest, by using a model adapted to the special nature of the data at hand. We also provide a computationally simple approximation of this estimator, and establish the consistency of both the original and the approximate versions under mild conditions. Monte Carlo simulations indicate that the proposed estimators have smaller bias than the Turnbull estimator based on incomplete recall data, smaller variance than the Turnbull estimator based on current status data, and smaller mean squared error than either of them. The method is applied to menarcheal data from a recent anthropometric study of adolescent and young adult females in Kolkata, India.
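For orientation, the baseline that the abstract compares against can be sketched in a few lines. The following is a minimal illustration of a self-consistency (EM) iteration of the kind associated with Turnbull's estimator for interval-censored data; it is not the estimator proposed in this article, which additionally models the chance of recall. The function name `turnbull_em`, its parameters, and the choice of candidate cells (all gaps between consecutive observed endpoints) are assumptions made for this sketch. Each observation is taken to be an interval (left, right] known to contain the event time; a current-status record examined at time c would be coded as (0, c] if the event has occurred and (c, inf] otherwise.

```python
import numpy as np

def turnbull_em(left, right, max_iter=5000, tol=1e-10):
    """Sketch of a self-consistency iteration for interval-censored data.

    Each observation i says the event time lies in (left[i], right[i]].
    Returns the cell boundaries (lo, hi] and the estimated mass p per cell.
    All names here are hypothetical and chosen for illustration only.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    if np.any(left >= right):
        raise ValueError("each interval must satisfy left < right")

    # Candidate cells: gaps between consecutive distinct endpoints.
    cuts = np.unique(np.concatenate([left, right]))
    lo, hi = cuts[:-1], cuts[1:]

    # alpha[i, k] = 1 if cell k lies inside observation i's interval.
    alpha = ((left[:, None] <= lo[None, :]) &
             (hi[None, :] <= right[:, None])).astype(float)

    n = left.size
    p = np.full(lo.size, 1.0 / lo.size)  # uniform starting masses
    for _ in range(max_iter):
        denom = alpha @ p                       # likelihood of each observation
        p_new = (alpha * p).T @ (1.0 / denom) / n  # expected share of each cell
        converged = np.max(np.abs(p_new - p)) < tol
        p = p_new
        if converged:
            break
    return lo, hi, p

# Usage: two observations in (0, 1], one in (0, 2], one in (1, 2].
lo, hi, p = turnbull_em([0, 0, 0, 1], [1, 1, 2, 2])
cdf_at_hi = np.cumsum(p)  # estimated distribution function at each cell's right end
```

Under informative censoring of the kind described above, where the chance of recall depends on the time elapsed since the event, the likelihood maximized by this iteration is misspecified, which is the motivation given for the modified estimator.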