The Ecology Centre, School of Life Sciences, The University of Queensland, Brisbane, Qld 4072, Australia; Environmental Science, School of Botany, University of Melbourne, Vic. 3010, Australia; School of Geography, Planning and Architecture, The University of Queensland, Brisbane, Qld 4072, Australia; CSIRO Mathematical and Information Sciences, Cleveland, Qld, Australia; School of Earth and Environmental Sciences, University of Adelaide, North Terrace, SA 5005, Australia; School of Mathematical Sciences, Queensland University of Technology, Brisbane, Qld 4001, Australia; School of Natural Resources, University of Nebraska-Lincoln, Lincoln, NE, USA.
Ecol Lett. 2005 Nov;8(11):1235-46. doi: 10.1111/j.1461-0248.2005.00826.x.
A common feature of ecological data sets is their tendency to contain many zero values. Statistical inference based on such data is likely to be inefficient or wrong unless careful thought is given to how these zeros arose and how best to model them. In this paper, we propose a framework for understanding how zero-inflated data sets originate and deciding how best to model them. We define and classify the different kinds of zeros that occur in ecological data and describe how they arise: either from 'true zero' or 'false zero' observations. After reviewing recent developments in modelling zero-inflated data sets, we use practical examples to demonstrate how failing to account for the source of zero inflation can reduce our ability to detect relationships in ecological data and, at worst, lead to incorrect inference. The adoption of methods that explicitly model the sources of zero observations will sharpen insights and improve the robustness of ecological analyses.
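The distinction between zero sources can be illustrated with a small simulation. The sketch below is not taken from the paper: the function names, the parameter values, and the specific generating process (site occupancy, Poisson abundance, independent per-individual detection) are all assumptions chosen for illustration. It generates counts in which some zeros come from sites where the species is absent and others from occupied sites where every individual was missed, then compares the observed fraction of zeros with the fraction a plain Poisson model with the same mean would predict.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_counts(n_sites, occupancy, lam, detection, seed=0):
    """Simulate survey counts under a hypothetical zero-generating process.

    occupancy: probability that a site is occupied (assumed value)
    lam:       mean abundance at occupied sites (assumed value)
    detection: probability of detecting each individual (assumed value)
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sites):
        if rng.random() >= occupancy:
            counts.append(0)  # zero because the species is absent from the site
            continue
        abundance = poisson_sample(rng, lam)
        # Each individual present is detected independently; an occupied site
        # with abundance > 0 but no detections yields a zero from missed detection.
        observed = sum(1 for _ in range(abundance) if rng.random() < detection)
        counts.append(observed)
    return counts


def zero_summary(counts):
    """Return (observed zero fraction, zero fraction expected under Poisson)."""
    n = len(counts)
    mean = sum(counts) / n
    observed_zero = counts.count(0) / n
    poisson_zero = math.exp(-mean)  # P(Y = 0) for a Poisson with the sample mean
    return observed_zero, poisson_zero


if __name__ == "__main__":
    counts = simulate_counts(n_sites=5000, occupancy=0.5, lam=3.0, detection=0.6, seed=1)
    observed_zero, poisson_zero = zero_summary(counts)
    print(f"observed zero fraction: {observed_zero:.3f}")
    print(f"Poisson-expected zero fraction: {poisson_zero:.3f}")
```

With these assumed settings the observed zero fraction is well above the Poisson expectation, which is the pattern usually described as zero inflation. The counts alone do not reveal which zeros came from absence and which from missed detection.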