Ejarque-Gonzalez Elisabet, Butturini Andrea
Departament d'Ecologia, Facultat de Biologia, Universitat de Barcelona, Barcelona, Catalunya, Spain.
PLoS One. 2014 Jun 6;9(6):e99618. doi: 10.1371/journal.pone.0099618. eCollection 2014.
Dissolved organic matter (DOM) is a complex mixture of organic compounds, ubiquitous in marine and freshwater systems. Fluorescence spectroscopy, by means of Excitation-Emission Matrices (EEM), has become an indispensable tool to study DOM sources, transport and fate in aquatic ecosystems. However, the statistical treatment of large and heterogeneous EEM data sets still represents an important challenge for biogeochemists. Recently, Self-Organising Maps (SOM) have been proposed as a tool to explore patterns in large EEM data sets. SOM is a pattern recognition method that clusters input EEMs and reduces their dimensionality without relying on any assumption about the data structure. In this paper, we show how SOM, coupled with a correlation analysis of the component planes, can be used both to explore patterns among samples and to identify individual fluorescence components. We analysed a large and heterogeneous EEM data set, including samples from a river catchment collected under a range of hydrological conditions, along a 60-km downstream gradient, and under the influence of different degrees of anthropogenic impact. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. On the other hand, river samples collected under flash flood conditions showed homogeneous EEM shapes. The correlation analysis of the component planes suggested the presence of four fluorescence components, consistent with DOM components previously described in the literature. A remarkable strength of this methodology was that outlier samples appeared naturally integrated in the analysis. We conclude that SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.
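The general idea of training a SOM and then correlating its component planes can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, map size, learning schedule and synthetic data below are all assumptions chosen for illustration. Each row of `data` stands in for one EEM unfolded into a vector, and each column for one excitation-emission wavelength pair. A component plane is then the set of values that one variable takes across all map units, so variables whose planes correlate strongly are candidates for belonging to the same fluorescence component.

```python
import numpy as np


def train_som(data, rows=4, cols=4, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM with online updates.

    Returns the codebook, shape (rows * cols, n_features).
    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_samples, _ = data.shape
    # Initialise each map unit from a randomly chosen sample.
    codebook = data[rng.integers(0, n_samples, size=rows * cols)].astype(float)
    # Grid coordinates (row, col) of every map unit.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                      # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # neighbourhood radius shrinks to 0.5
        x = data[rng.integers(0, n_samples)]
        # Best-matching unit: the codebook vector closest to the sample.
        bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))
        # Gaussian neighbourhood around the BMU on the map grid.
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        codebook += lr * h[:, None] * (x - codebook)
    return codebook


def component_plane_correlation(codebook):
    """Pearson correlation between component planes (codebook columns)."""
    return np.corrcoef(codebook, rowvar=False)


# Synthetic stand-in for unfolded EEMs (hypothetical data, not from the paper):
# variables 0-2 follow latent signal a, variables 3-5 follow latent signal b.
rng = np.random.default_rng(42)
a = rng.random(300)
b = rng.random(300)
noise = 0.05 * rng.standard_normal((300, 6))
data = np.column_stack([a, a, a, b, b, b]) + noise

codebook = train_som(data)
corr = component_plane_correlation(codebook)
print(np.round(corr, 2))
```

In this toy setting, variables driven by the same latent signal should produce strongly correlated component planes, which is the kind of grouping the correlation analysis relies on. Real EEM data would need preprocessing (for example scatter removal and normalisation) that this sketch leaves out.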