Accounting for preferential sampling in species distribution models.

Author information

Pennino Maria Grazia, Paradinas Iosu, Illian Janine B, Muñoz Facundo, Bellido José María, López-Quílez Antonio, Conesa David

Affiliations

Instituto Español de Oceanografía, Centro Oceanográfico de Vigo, Vigo, Spain.

Departament d'Estadística i Investigació Operativa, Universitat de València, Valencia, Spain.

Publication information

Ecol Evol. 2018 Dec 26;9(1):653-663. doi: 10.1002/ece3.4789. eCollection 2019 Jan.

DOI:10.1002/ece3.4789

PMID:30680145

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6342115/

Abstract

Species distribution models (SDMs) are now widely used in ecology for management and conservation purposes across terrestrial, freshwater, and marine realms. The increasing interest in SDMs has drawn the attention of ecologists to spatial models and, in particular, to geostatistical models, which are used to associate observations of species occurrence or abundance with environmental covariates in a finite number of locations in order to predict where (and how much of) a species is likely to be present in unsampled locations. Standard geostatistical methodology assumes that the choice of sampling locations is independent of the values of the variable of interest. However, in natural environments, due to practical limitations related to time and financial constraints, this theoretical assumption is often violated. In fact, data commonly derive from opportunistic sampling (e.g., whale or bird watching), in which observers tend to look for a specific species in areas where they expect to find it. These are examples of what is referred to as preferential sampling, which can lead to biased predictions of the distribution of the species. The aim of this study is to discuss an SDM that addresses this problem and that is more computationally efficient than existing MCMC methods. From a statistical point of view, we interpret the data as a marked point pattern, where the sampling locations form a point pattern and the measurements taken in those locations (i.e., species abundance or occurrence) are the associated marks. Inference and prediction of species distribution are performed using a Bayesian approach, and integrated nested Laplace approximation (INLA) methodology and software are used for model fitting to minimize the computational burden. We show that abundance is highly overestimated at low abundance locations when preferential sampling effects are not accounted for, in both a simulated example and a practical application using fishery data. This highlights that ecologists should be aware of the potential bias resulting from preferential sampling and account for it in a model when a survey is based on non-randomized and/or non-systematic sampling.
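
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model: it fits no marked point pattern and does not use INLA. It only compares a uniform sampling design with a preferential one on a made-up smooth surface. The function names, the sinusoidal surface, and the preference strength `beta` are all assumptions chosen for illustration.

```python
import math
import random


def latent_field(x, y):
    """Hypothetical smooth surface standing in for a spatial random field."""
    return math.sin(2 * math.pi * x) + math.cos(2 * math.pi * y)


def simulate(n_samples=500, beta=2.0, grid=50, seed=1):
    """Compare naive sample means under uniform and preferential designs."""
    rng = random.Random(seed)
    # Cell centres of a regular grid on the unit square.
    cells = [((i + 0.5) / grid, (j + 0.5) / grid)
             for i in range(grid) for j in range(grid)]
    field = [latent_field(x, y) for x, y in cells]
    true_mean = sum(field) / len(field)

    # Uniform design: every cell is equally likely to be visited.
    uniform = rng.choices(field, k=n_samples)

    # Preferential design (assumed form): the probability of visiting a
    # cell is proportional to exp(beta * field value), so high-value
    # cells are visited far more often than low-value ones.
    weights = [math.exp(beta * s) for s in field]
    preferential = rng.choices(field, weights=weights, k=n_samples)

    return {
        "true_mean": true_mean,
        "uniform_mean": sum(uniform) / n_samples,
        "preferential_mean": sum(preferential) / n_samples,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the naive mean from the preferential design lies well above the true mean of the surface, while the uniform design stays close to it. Setting `beta=0.0` makes the two designs equivalent, which is the independence assumption of standard geostatistical methodology.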
