An accurate substitution method for analyzing censored data.

Affiliation

Department of Mathematics, West Virginia University, Morgantown, West Virginia, USA.

Publication information

J Occup Environ Hyg. 2010 Apr;7(4):233-44. doi: 10.1080/15459621003609713.

PMID: 20169489

Abstract

When analyzing censored datasets, in which one or more measurements fall below the limit of detection (LOD), maximum likelihood estimation (MLE) is often considered the gold standard for estimating the geometric mean (GM) and geometric standard deviation (GSD) of the underlying exposure profile. A new and relatively simple substitution method, called beta-substitution, is presented and compared with the MLE method and the common substitution methods (LOD/2 and LOD/√2 substitution) for analyzing a left-censored dataset with either single or multiple censoring points. A computer program was used to generate censored exposure datasets for various combinations of true GSD (1.2 to 4), percent censoring (1% to 50%), and sample size (5 to 19 and 20 to 100). Each method was used to estimate four parameters of the lognormal distribution for the censored datasets: (1) the GM; (2) the GSD; (3) the 95th percentile; and (4) the Mean. When estimating the GM and GSD, the bias and root mean square error (rMSE) for the beta-substitution method closely matched those for the MLE method, differing by only a small amount that decreased with increasing sample size. When estimating the Mean and 95th percentile, the bias of the beta-substitution method closely matched or bettered that of the MLE method. In addition, the overall imprecision, as indicated by the rMSE, was similar to that of the MLE method when estimating the GM, GSD, 95th percentile, and Mean. The bias for the common substitution methods was highly variable, depending strongly on the range of GSD values. The beta-substitution method produced results comparable to the MLE method and is considerably easier to calculate, making it an attractive alternative. In terms of bias, it is clearly superior to the commonly used LOD/2 and LOD/√2 substitution methods. The rMSE results for the two common substitution methods were often comparable to those for the MLE method, but these substitution methods were often considerably biased.
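
The kind of simulation summarized above can be illustrated with a minimal Python sketch. It covers only the two common substitution methods (LOD/2 and LOD/√2); the beta-substitution factors are not given in the abstract and are not reproduced here. The function names, the single censoring point placed at the quantile matching the expected censoring fraction, the plug-in estimators for the 95th percentile and the Mean, and all default parameter values are assumptions made for illustration, not the authors' program.

```python
import math
import random
import statistics

Z95 = 1.645  # approximate 95th percentile of the standard normal distribution


def lognormal_summary(values):
    """Plug-in estimates of GM, GSD, 95th percentile, and Mean (assumed estimators)."""
    logs = [math.log(v) for v in values]
    mu = statistics.fmean(logs)
    sigma = statistics.stdev(logs)
    return {
        "gm": math.exp(mu),
        "gsd": math.exp(sigma),
        "p95": math.exp(mu + Z95 * sigma),
        "mean": math.exp(mu + 0.5 * sigma ** 2),
    }


def substitute(values, lod, divisor):
    """Replace every value below the LOD with LOD / divisor (2 or sqrt(2))."""
    return [v if v >= lod else lod / divisor for v in values]


def simulate(true_gm=1.0, true_gsd=2.0, n=20, censor_frac=0.3,
             divisor=2.0, trials=2000, seed=0):
    """Return (bias, rMSE) of the GM estimate under simple substitution.

    Assumption: a single LOD fixed at the true quantile that censors
    `censor_frac` of the distribution on average.
    """
    rng = random.Random(seed)
    mu, sigma = math.log(true_gm), math.log(true_gsd)
    lod = math.exp(mu + statistics.NormalDist().inv_cdf(censor_frac) * sigma)
    errors = []
    for _ in range(trials):
        sample = [rng.lognormvariate(mu, sigma) for _ in range(n)]
        estimate = lognormal_summary(substitute(sample, lod, divisor))
        errors.append(estimate["gm"] - true_gm)
    bias = statistics.fmean(errors)
    rmse = math.sqrt(statistics.fmean([e * e for e in errors]))
    return bias, rmse
```

Calling `simulate(divisor=2.0)` and `simulate(divisor=math.sqrt(2))` over a range of `true_gsd` values gives a rough sense of how the bias of each substitution rule shifts with the GSD, which is the behavior the abstract describes as highly variable.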
