Haseman J K, Soares E R
Mutat Res. 1976 Dec;41(2-3):277-88. doi: 10.1016/0027-5107(76)90101-9.
In dominant lethal testing, fetal death is generally assumed to follow either a Poisson or binomial distribution. However, both of these models were found to be inappropriate when three large sets of mouse control data and other data sets from the literature were examined. The validity of statistical test procedures based on these inappropriate models was then studied in detail. It was found that chi-square tests (which assume an underlying binomial distribution) may seriously exaggerate the level of significance and hence should not be used. In contrast, the inappropriateness of the underlying Poisson or binomial model appeared to have little effect on the validity of pairwise comparisons by analysis of variance procedures. Unlike chi-square, these procedures regard the pregnant female rather than the individual implant as the experimental unit. However, a statistical analysis of dominant lethal data generally involves more than a series of pairwise comparisons, and it is unclear how an invalid underlying model may affect statistical test procedures in this more complex situation. Moreover, it is difficult to justify the use of statistical models that are demonstrably invalid when a reasonable alternative exists. Thus, until a satisfactory parametric model can be found and appropriate test procedures derived, we prefer to analyze dominant lethal data by non-parametric (distribution-free) methods. Proportion of dead implants per female appears to be a more meaningful measure of fetal death than number of dead implants per female for several reasons, including: (1) analyses based on proportions take the total number of implants per female into account; and (2) analyses based on proportions make more reasonable assumptions concerning pre-implantation losses and are more powerful when such losses occur.
Despite our concern with the appropriateness of the underlying model, in practice we have found few instances in which non-parametric and analysis of variance procedures have led to markedly different conclusions.
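The contrast between treating the implant and treating the pregnant female as the experimental unit can be illustrated with a minimal sketch. The litter counts below are hypothetical and chosen only so that the deaths cluster in a few females. The function names are likewise illustrative, and the Mann-Whitney test with a normal approximation stands in for "a non-parametric method"; the abstract does not say which distribution-free procedure the authors used.

```python
import math

# Hypothetical data: (dead implants, total implants) per pregnant female.
control = [(0, 12), (1, 11), (0, 10), (2, 13), (1, 12), (0, 11)]
treated = [(1, 12), (0, 11), (6, 10), (1, 12), (0, 13), (5, 9)]


def pooled_chi_square(a, b):
    """Pearson chi-square (1 df) on the pooled 2x2 table; the implant is the unit."""
    dead_a, n_a = sum(d for d, _ in a), sum(n for _, n in a)
    dead_b, n_b = sum(d for d, _ in b), sum(n for _, n in b)
    table = [[dead_a, n_a - dead_a], [dead_b, n_b - dead_b]]
    rows = [n_a, n_b]
    cols = [dead_a + dead_b, (n_a - dead_a) + (n_b - dead_b)]
    total = n_a + n_b
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def midranks(values):
    """Ranks 1..n, with tied values sharing the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_z(x, y):
    """Tie-corrected normal-approximation z for the Mann-Whitney U test."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in (combined.count(v) for v in set(combined)))
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    return (u1 - n1 * n2 / 2) / math.sqrt(variance)


def p_from_chi_square_1df(stat):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


def p_two_sided_from_z(z):
    """Two-sided tail probability of a standard normal variate."""
    return math.erfc(abs(z) / math.sqrt(2))


# Implant as unit: every implant is treated as an independent binomial trial.
p_chi = p_from_chi_square_1df(pooled_chi_square(control, treated))

# Female as unit: one proportion of dead implants per pregnant female.
prop_control = [d / n for d, n in control]
prop_treated = [d / n for d, n in treated]
p_mw = p_two_sided_from_z(mann_whitney_z(prop_control, prop_treated))

print(f"pooled chi-square p = {p_chi:.3f}")
print(f"per-female Mann-Whitney p = {p_mw:.3f}")
```

With these made-up litters, most treated-group deaths come from two females, so the pooled chi-square test returns a much smaller p-value than the per-female rank test. With only six females per group the normal approximation is rough; an exact permutation distribution would be preferable for real data.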