Bellacicco Marco, Vellucci Vincenzo, Scardi Michele, Barbieux Marie, Marullo Salvatore, D'Ortenzio Fabrizio
Sorbonne Université, CNRS, Laboratoire d'Océanographie de Villefranche, LOV, F-06230 Villefranche-sur-Mer, France.
Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), 00044 Frascati, Italy.
Sensors (Basel). 2019 Jul 9;19(13):3032. doi: 10.3390/s19133032.
Linear regression is widely used in applied sciences and, in particular, in satellite optical oceanography, to relate dependent to independent variables. It is often adopted to establish empirical algorithms based on a finite set of measurements, which are later applied to larger-scale observations from platforms such as autonomous profiling floats equipped with optical instruments (e.g., Biogeochemical Argo floats; BGC-Argo floats) and satellite ocean colour sensors (e.g., SeaWiFS, VIIRS, OLCI). However, different methods can be applied to a given pair of variables to determine the coefficients of the linear equation fitting the data, so these coefficients are not unique. In this work, we quantify the impact of the choice of "regression method" (i.e., either type-I or type-II) on the derivation of bio-optical relationships, both from a theoretical perspective and through specific examples. We applied commonly used regression methods to an in situ data set of particulate organic carbon (POC), total chlorophyll-a (TChla), and optical particulate backscattering coefficient (bbp), and to 19 years of monthly TChla and bbp ocean colour data. The results of the regression analysis were used to calculate phytoplankton carbon biomass (Cphyto) and POC from: i) BGC-Argo float observations; ii) oceanographic cruises; and iii) satellite data. These applications highlight the differences in Cphyto and POC estimates that arise from the choice of method. An analysis of the statistical properties of the dataset and a detailed description of the hypothesis of the work drive the selection of the linear regression method.
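To illustrate why the fitted coefficients are not unique, the following minimal sketch compares one type-I method (ordinary least squares, OLS) with one type-II method (reduced major axis, RMA) on the same pair of variables. It is not the authors' code: the data are synthetic, the function names `ols_fit` and `rma_fit` are hypothetical, and RMA is only one of several type-II methods, chosen here as an assumption for illustration.

```python
import numpy as np


def ols_fit(x, y):
    """Type-I (ordinary least squares) fit of y on x; assumes x is error-free."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept


def rma_fit(x, y):
    """Type-II (reduced major axis) fit; treats x and y symmetrically."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept


# Synthetic example (illustrative values only, not data from the paper):
# both variables carry measurement noise around a common linear relation.
rng = np.random.default_rng(0)
truth = rng.uniform(0.0, 10.0, size=200)
x = truth + rng.normal(0.0, 1.0, size=200)
y = 2.0 * truth + 1.0 + rng.normal(0.0, 2.0, size=200)

r = np.corrcoef(x, y)[0, 1]
slope_ols, intercept_ols = ols_fit(x, y)
slope_rma, intercept_rma = rma_fit(x, y)

print(f"r = {r:.3f}")
print(f"OLS: y = {slope_ols:.3f} x + {intercept_ols:.3f}")
print(f"RMA: y = {slope_rma:.3f} x + {intercept_rma:.3f}")
```

In this sketch the OLS slope equals the RMA slope multiplied by the correlation coefficient r, so the two methods agree only when the variables are perfectly correlated. Any quantity derived from the fitted line, such as a carbon estimate computed from an optical measurement, inherits that difference.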