Dokter Adriaan M, van Loon E Emiel, Fokkema Wimke, Lameris Thomas K, Nolet Bart A, van der Jeugd Henk P
Dutch Centre for Avian Migration and Demography Netherlands Institute of Ecology Wageningen The Netherlands.
Department of Animal Ecology Netherlands Institute of Ecology Wageningen The Netherlands.
Ecol Evol. 2017 Aug 9;7(18):7362-7369. doi: 10.1002/ece3.3281. eCollection 2017 Sep.
A common problem with observational datasets is that not all events of interest may be detected. For example, observing animals in the wild can be difficult when animals move, hide, or cannot be closely approached. We consider time series of events recorded in conditions where events are occasionally missed by observers or observational devices. These time series are not restricted to behavioral protocols, but can be any cyclic or recurring process where discrete outcomes are observed. Undetected events cause biased inferences on the process of interest, and statistical analyses are needed that can identify and correct the compromised detection processes. Missed observations in time series lead to observed time intervals between events at multiples of the true inter-event time, which conveys information on their detection probability. We derive the theoretical probability density function for observed intervals between events that includes a probability of missed detection. Methodology and software tools are provided for analysis of event data with potential observation bias and its removal. The methodology was applied to simulation data and a case study of defecation rate estimation in geese, which is commonly used to estimate their digestive throughput and energetic uptake, or to calculate goose usage of a feeding site from dropping density. Simulations indicate that at a moderate chance to miss arrival events (probability 0.3), uncorrected arrival intervals were biased upward by up to a factor of 3, while parameter values corrected for missed observations were within 1% of their true simulated value. A field case study shows that not accounting for missed observations leads to substantial underestimates of the true defecation rate in geese, and spurious rate differences between sites, which are introduced by differences in observational conditions. These results show that the derived methodology can be used to effectively remove observational biases in time-ordered event data.
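The effect of missed detections on observed intervals can be illustrated with a small simulation. The sketch below is not the authors' method or software. It assumes normally distributed true inter-event times, independent misses with a fixed probability, and hypothetical names and parameter values (`simulate_observed_intervals`, `TRUE_MEAN`, `P_MISS`) chosen only for illustration.

```python
import random
from collections import Counter

# Illustrative values only (assumptions, not taken from the paper's simulations).
TRUE_MEAN = 100.0   # true mean inter-event time, arbitrary time units
TRUE_SD = 5.0       # spread of the true inter-event time
P_MISS = 0.3        # probability that any single event goes undetected


def simulate_observed_intervals(n_events, mean, sd, p_miss, seed=0):
    """Generate events, drop each one with probability p_miss,
    and return the intervals between the events that were detected."""
    rng = random.Random(seed)
    t = 0.0
    detected_times = []
    for _ in range(n_events):
        # Advance to the next true event (guard against negative draws).
        t += max(rng.gauss(mean, sd), 0.0)
        # The event is recorded only if it is not missed.
        if rng.random() >= p_miss:
            detected_times.append(t)
    return [b - a for a, b in zip(detected_times, detected_times[1:])]


intervals = simulate_observed_intervals(20000, TRUE_MEAN, TRUE_SD, P_MISS)

# Naive estimate: mean of the observed intervals, ignoring missed events.
naive_mean = sum(intervals) / len(intervals)

# Assign each observed interval to the nearest multiple of the true mean.
# This uses the known simulated mean, which a real analysis would not have.
multiples = Counter(max(1, round(x / TRUE_MEAN)) for x in intervals)
share_single = multiples[1] / len(intervals)

print(f"naive mean interval: {naive_mean:.1f}")
print(f"share of intervals at 1x the true interval: {share_single:.3f}")
for k in sorted(multiples)[:4]:
    expected = (P_MISS ** (k - 1)) * (1 - P_MISS)
    print(f"{k}x: observed {multiples[k] / len(intervals):.3f}, "
          f"geometric expectation {expected:.3f}")
```

Under these assumptions the number of events missed between two consecutive detections follows a geometric distribution, so the share of intervals at each multiple carries information on the miss probability, and the naive mean is inflated by roughly a factor 1 / (1 - `P_MISS`). A real analysis does not know the true interval in advance, which is why the abstract describes fitting a probability density function to the observed intervals instead.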