Jones Michael P, Perry Sarah S, Thorne Peter S
Department of Biostatistics, University of Iowa, Iowa City, IA 52242.
Department of Occupational and Environmental Health, University of Iowa, Iowa City, IA 52242.
J Agric Biol Environ Stat. 2015 Mar;20(1):83-99. doi: 10.1007/s13253-014-0185-y. Epub 2014 Dec 6.
Toxicological studies often depend on laboratory assays that have thresholds below which environmental pollutants cannot be measured with accuracy. Exposure levels below this limit of detection may well be toxic, and hence it is vital to use data analytic methods that handle such left-censored data with as little estimation bias as possible. In an ongoing study for which our methodology is developed, levels of residential exposure to polychlorinated biphenyls (PCBs) and the interrelationships of their subtypes (congeners) are characterized. In any given sample many of the congeners may fall below the detection limit. The main problem tackled in this paper is the estimation of mean exposure levels and the corresponding covariance and correlation matrices for a large number of potentially left-censored measures, using estimators that have very low bias and are computationally feasible. The proposed methods are likelihood based, using marginal likelihoods for means and variances and pairwise pseudo-likelihoods for correlations and covariances. In the simple bivariate case, head-to-head comparisons show the proposed methods to be computationally more stable than ordinary maximum likelihood estimates (MLEs) while maintaining comparable bias. When the number of variables is much larger than 2, the proposed methods are far more computationally feasible than MLE. Furthermore, they exhibit much less bias than popular imputation procedures. Analysis of the PCB data uncovered interesting correlational structures.
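The marginal-likelihood step for a single measure can be illustrated with a minimal sketch. This is not the authors' code: it assumes the measure is normally distributed (for example after a log transformation) with one known detection limit, and the function names, the simulated data, and the simple compass-search optimizer are hypothetical choices made for illustration. The pairwise pseudo-likelihood step for correlations is not shown.

```python
import math
import random


def censored_normal_loglik(mu, log_sigma, observed, n_censored, limit):
    """Log-likelihood of a normal sample that is left-censored at `limit`."""
    sigma = math.exp(log_sigma)
    ll = 0.0
    # Values at or above the limit contribute the normal log-density.
    for x in observed:
        z = (x - mu) / sigma
        ll += -0.5 * z * z - log_sigma - 0.5 * math.log(2.0 * math.pi)
    # Each censored value contributes log P(X < limit) = log Phi((limit - mu) / sigma).
    p = 0.5 * math.erfc(-(limit - mu) / (sigma * math.sqrt(2.0)))
    ll += n_censored * math.log(max(p, 1e-300))  # guard against log(0)
    return ll


def fit_censored_normal(values, limit, tol=1e-7):
    """Maximize the censored likelihood over (mu, log sigma) by compass search.

    Readings below `limit` are treated as censored: only their count is used.
    Assumes at least a few readings lie at or above the limit.
    """
    observed = [v for v in values if v >= limit]
    n_censored = len(values) - len(observed)

    # Crude starting point (an assumption): substitute the limit for censored values.
    filled = observed + [limit] * n_censored
    mu0 = sum(filled) / len(filled)
    var0 = sum((v - mu0) ** 2 for v in filled) / len(filled)
    theta = [mu0, 0.5 * math.log(max(var0, 1e-12))]
    best = censored_normal_loglik(theta[0], theta[1], observed, n_censored, limit)

    step = 0.5
    while step > tol:
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                trial = list(theta)
                trial[i] += delta
                value = censored_normal_loglik(
                    trial[0], trial[1], observed, n_censored, limit
                )
                if value > best:
                    theta, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # shrink the search pattern when no move helps
    return theta[0], math.exp(theta[1])


# Simulated example (hypothetical data): true mean 0, true SD 1, limit -0.5.
random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(2000)]
mu_hat, sigma_hat = fit_censored_normal(sample, limit=-0.5)
print(mu_hat, sigma_hat)
```

Because the optimizer works on log sigma, the standard deviation stays positive without explicit constraints. In practice a library optimizer would normally replace the compass search; it is used here only to keep the sketch dependency-free.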