Chu Haitao, Nie Lei, Cole Stephen R
Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA.
Stat Med. 2006 Aug 15;25(15):2647-57. doi: 10.1002/sim.2503.
In randomized clinical trials and observational cohort studies, a non-negative, continuously distributed response variable is often measured in treatment and control groups. When the response variable has true zeros, a two-part zero-inflated log-normal model (which assumes that the data have a probability mass at zero and a continuous response for values greater than zero) is usually recommended. However, in some environmental health and human immunodeficiency virus (HIV) studies, quantitative assays for metabolites of toxicants or quantitative HIV RNA measurements are subject to left-censoring because values fall below the limit of detection (LD). In this setting, a zero-inflated log-normal mixture model is often suggested, since true zeros are indistinguishable from left-censored values below the LD. When the probabilities of true zeros in the two groups are not restricted to be equal, the information contributed by values falling below the LD is used only to estimate the probability of true zeros in the context of mixture distributions. We derive the sample size required to assess the effect of a treatment in the context of mixture models with equal and unequal variances, based on the left-truncated log-normal distribution. Methods for calculating statistical power are also presented. We calculate the required sample size and power for a recent study estimating the effect of oltipraz on reducing urinary levels of the hydroxylated metabolite aflatoxin M1 (AFM1) in a randomized, placebo-controlled, double-blind phase IIa chemoprevention trial in Qidong, China. A Monte Carlo simulation study is conducted to investigate the performance of the proposed methods.
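The mixture structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' sample size or power formula; it only simulates one group from a zero-inflated log-normal model with a limit of detection and evaluates the corresponding log-likelihood, in which a non-detect contributes the combined probability of a true zero or a positive value below the LD. The function names (`simulate`, `log_likelihood`) and all parameter values are hypothetical and chosen for illustration only.

```python
import math
import random


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate(n, p_zero, mu, sigma, ld, rng):
    """Draw n responses from a zero-inflated log-normal model.

    With probability p_zero the response is a true zero; otherwise its
    logarithm is normal with mean mu and standard deviation sigma.
    Any value below the limit of detection ld is returned as None,
    so true zeros and left-censored values are indistinguishable.
    """
    out = []
    for _ in range(n):
        y = 0.0 if rng.random() < p_zero else math.exp(rng.gauss(mu, sigma))
        out.append(y if y >= ld else None)
    return out


def log_likelihood(data, p_zero, mu, sigma, ld):
    """Log-likelihood of the mixture model for data with non-detects."""
    # Probability of a non-detect: true zero, or positive but below ld.
    p_below = p_zero + (1.0 - p_zero) * norm_cdf((math.log(ld) - mu) / sigma)
    ll = 0.0
    for y in data:
        if y is None:
            ll += math.log(p_below)
        else:
            # Log-normal density for a detected value, weighted by 1 - p_zero.
            z = (math.log(y) - mu) / sigma
            ll += (math.log(1.0 - p_zero) - math.log(y * sigma)
                   - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z)
    return ll


# Illustrative values only (assumptions, not taken from the study).
rng = random.Random(0)
data = simulate(2000, p_zero=0.2, mu=1.0, sigma=0.8, ld=1.0, rng=rng)
frac_nondetect = sum(y is None for y in data) / len(data)
ll_true = log_likelihood(data, 0.2, 1.0, 0.8, 1.0)
ll_wrong = log_likelihood(data, 0.2, 3.0, 0.8, 1.0)
```

Under these assumed values the expected non-detect fraction is 0.2 + 0.8 Φ(−1.25), roughly 0.28, and the log-likelihood at the generating parameters should exceed that at a badly misspecified mean. A power study in the spirit of the abstract's Monte Carlo simulation would repeat such draws for two groups and record how often a test on the treatment effect rejects.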