Limitations of maximum likelihood estimation procedures when a majority of the observations are below the limit of detection.

Author information

Jain Ram B, Wang Richard Y

Affiliation

Centers for Disease Control and Prevention, Mail Stop F-47, 4770 Buford Highway, Atlanta, Georgia 30341, USA.

Publication information

Anal Chem. 2008 Jun 15;80(12):4767-72. doi: 10.1021/ac8003743. Epub 2008 May 20.

Abstract

We evaluated the performance of maximum likelihood estimation procedures to estimate the population mean and standard deviation (SD) of log-transformed data sets containing serum or urinary analytical measurements with 50-80% of observations below the limit of detection (LOD). We found that maximum likelihood procedures are limited in their ability to accurately estimate the population mean and SD when the percentage of censored data was large and the sample size was small. With these procedures, the means were more likely to be underestimated and the SDs were more likely to be overestimated. When the sample size, N, was ≤ 100 and the percentage of observations below the LOD, P, was ≥ 70%, the procedure without imputations performed better than those with imputations. However, the procedure with multiple imputations performed better than or was comparable to the other procedures when N was at least 100. This finding was consistent with the improved estimates of the mean and SD in a data set (N = 113) of polychlorinated biphenyl (PCB) concentrations using multiple imputations. We recommend the use of maximum likelihood procedures with multiple imputation when N ≥ 100 and P < 70%. A maximum likelihood procedure without imputation should be preferred when N < 100 and P ≥ 70%. However, it should be expected that biases for both the mean and the SD in these circumstances may be unacceptably high.
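
The abstract does not spell out the likelihood or the fitting algorithm, so the following is only a minimal sketch of what a maximum likelihood procedure without imputation might look like for left-censored data. It assumes a single LOD, values that are normally distributed on the log scale, and an EM iteration to maximize the censored-normal likelihood. The function names (`censored_normal_mle`, `simulate_demo`) and all parameter choices are hypothetical and are not taken from the paper.

```python
import math
import random


def _phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_normal_mle(observed, n_censored, lod, max_iter=5000, tol=1e-10):
    """Estimate (mean, SD) of a normal sample with left-censoring at `lod`.

    observed   : log-scale values at or above the LOD (needs at least two)
    n_censored : number of observations known only to be below the LOD
    lod        : limit of detection on the same (log) scale
    Assumption: one common LOD for all observations.
    """
    n_obs = len(observed)
    n = n_obs + n_censored
    s1 = sum(observed)
    s2 = sum(x * x for x in observed)

    # Start from the moments of the detected values only (biased upward).
    mu = s1 / n_obs
    sigma = math.sqrt(max(s2 / n_obs - mu * mu, 1e-12))

    for _ in range(max_iter):
        # E-step: first and second moments of a value truncated above at lod.
        alpha = (lod - mu) / sigma
        lam = _phi(alpha) / max(_Phi(alpha), 1e-300)
        m1 = mu - sigma * lam
        var_trunc = sigma * sigma * (1.0 - alpha * lam - lam * lam)
        m2 = var_trunc + m1 * m1

        # M-step: update mean and SD using the expected sufficient statistics.
        new_mu = (s1 + n_censored * m1) / n
        new_var = (s2 + n_censored * m2) / n - new_mu * new_mu
        new_sigma = math.sqrt(max(new_var, 1e-12))

        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


def simulate_demo(n=5000, lod=0.25, seed=0):
    """Simulate log-scale data with true mean 0 and SD 1, censor below lod."""
    rng = random.Random(seed)
    values = [rng.gauss(0.0, 1.0) for _ in range(n)]
    observed = [x for x in values if x >= lod]
    return censored_normal_mle(observed, n - len(observed), lod)


if __name__ == "__main__":
    mu_hat, sigma_hat = simulate_demo()
    print("estimated mean:", mu_hat, "estimated SD:", sigma_hat)
```

With `lod=0.25` roughly 60% of the simulated values are censored. The demo uses a large sample so the estimates settle near the true values; rerunning `simulate_demo` with a small `n` and a higher `lod` is one way to explore the small-sample, heavy-censoring regime that the abstract warns about.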
