He Hua, Tang Wan, Kelly Tanika, Li Shengxu, He Jiang
Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA.
Department of Biostatistics and Data Science, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA.
Stat Methods Med Res. 2020 Aug;29(8):2179-2197. doi: 10.1177/0962280219885985. Epub 2019 Nov 18.
Measures of substance concentration in urine, serum or other biological matrices often have an assay limit of detection. When concentration levels fall below the limit, the exact measures cannot be obtained. Instead, the measures are censored: the only information available is that the levels are below the limit. Assuming the concentration levels come from a single population and follow a normal distribution, either directly or after some transformation, Tobit regression models (also called censored normal regression models) are the standard approach for analyzing such data. In practice, however, the data often exhibit more censored observations than would be expected under a Tobit regression model. One common cause is heterogeneity of the study population, arising from a latent group of subjects who lack the substance being measured. For such subjects, the measurements will always be under the limit. If a censored normal regression model is appropriate for the subjects who have the substance, the whole population follows a mixture of a censored normal regression model and a degenerate distribution for the latent class. While there are some studies on such mixture models, the fundamental question of whether such mixture modeling is necessary, i.e. whether such a latent class exists, has not yet been studied. In this paper, three tests, namely the Wald test, the likelihood ratio test and the score test, are developed for testing the existence of such a latent class. Simulation studies are conducted to evaluate the performance of the tests, and two real data examples are employed to illustrate the tests.
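The mixture described above can be made concrete with a small log-likelihood sketch. This is an illustration only, not the authors' implementation: the function and argument names (`mixture_tobit_loglik`, `lod`, `p`) are hypothetical, `p` denotes the assumed proportion of the latent class that lacks the substance, and `mu` holds the fitted means for the subjects (for example, linear predictors from a regression). A censored observation contributes `p + (1 - p) * Phi((lod - mu) / sigma)`, and an observed value contributes `(1 - p)` times the normal density.

```python
import math


def normal_cdf(z):
    """Standard normal CDF computed with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_logpdf(y, mu, sigma):
    """Log density of N(mu, sigma^2) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def mixture_tobit_loglik(y, censored, mu, sigma, lod, p):
    """Log-likelihood of a left-censored normal model mixed with a latent
    class (proportion p) whose measurements are always below the limit.

    y        : observed values (ignored where censored is True)
    censored : booleans, True if the measurement fell below lod
    mu       : per-subject means of the normal component (hypothetical input)
    sigma    : common standard deviation of the normal component
    lod      : limit of detection
    p        : assumed latent-class proportion, 0 <= p < 1
    """
    total = 0.0
    for yi, ci, mi in zip(y, censored, mu):
        if ci:
            # Below the limit: either latent class, or censored normal draw.
            total += math.log(p + (1.0 - p) * normal_cdf((lod - mi) / sigma))
        else:
            # Above the limit: must come from the normal component.
            total += math.log(1.0 - p) + normal_logpdf(yi, mi, sigma)
    return total
```

Setting `p = 0` recovers the ordinary Tobit log-likelihood, so the question of whether the latent class exists corresponds to whether `p` is zero in this parameterization. A likelihood ratio statistic could be sketched as twice the difference between the log-likelihoods maximized with `p` free and with `p` fixed at zero; the maximization itself and the reference distribution are not shown here.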