ADAMA Deutschland GmbH, Cologne, Germany.
Department of Plant and Environmental Sciences, University of Copenhagen, Copenhagen, Denmark.
Regul Toxicol Pharmacol. 2021 Apr;121:104871. doi: 10.1016/j.yrtph.2021.104871. Epub 2021 Jan 22.
It is tempting to base the evaluation of (eco-)toxicological assays solely on statistical significance tests. The approach is stringent and objective, and it facilitates binary decisions. However, tests in the framework of null hypothesis statistical testing (NHST) are thought experiments that rely heavily on assumptions. Gigerenzer has called the generic, unreflective application of statistical tests "mindless". While statistical tests have an appropriate domain of application, the present work investigates how unreflective testing may affect toxicological assessments. Dunnett multiple-comparison and Williams trend tests, together with their compatibility intervals, are compared with dose-response modelling in case studies in which the data neither follow textbook behavior nor behave as expected from a toxicological point of view. In such cases, toxicological assessments based only on p-values may be biased, and biological evaluations based on plausibility may need to be prioritized. If confidence in a negative assay outcome cannot be established, further data may be needed for a robust toxicological assessment.