Department of Chemical and Pharmaceutical Sciences, University of Trieste, 34127 Trieste, Italy.
Department of Environmental Chemistry, Pomeranian University in Słupsk, ul. Arciszewskiego 22b, 76-200, Słupsk, Poland.
Sci Total Environ. 2023 Jun 20;878:163084. doi: 10.1016/j.scitotenv.2023.163084. Epub 2023 Mar 28.
Evaluating the spatial and temporal distribution of pollutants is crucial for assessing the anthropogenic burden on the environment. Numerous chemometric approaches are available for data exploration, and they have been applied to environmental health assessment. Among the unsupervised methods, the Self-Organizing Map (SOM) is an artificial neural network that can handle non-linear problems and can be used for exploratory data analysis, pattern recognition, and variable relationship assessment. The model becomes considerably easier to interpret when the SOM is combined with clustering algorithms. This review comprises: (i) a description of the algorithm's operating principle, with a focus on the key parameters used for SOM initialization; (ii) a description of the SOM output features and how they can be used for data mining; (iii) a list of available software tools for performing the calculations; (iv) an overview of SOM applications for obtaining spatial and temporal pollution patterns in environmental compartments, with a focus on model training and result visualization; and (v) advice on reporting SOM model details in a paper, to attain comparability and reproducibility among published papers, together with advice on extracting valuable information from the model results.
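To make the operating principle mentioned in (i) concrete, the following is a minimal sketch of online SOM training in Python with NumPy. It is an illustration under stated assumptions, not the procedure or software described in the review: the function names `train_som` and `best_matching_units`, the rectangular grid, the random initialization within the data range, the Gaussian neighborhood, and the linear decay of learning rate and radius are all choices made here for brevity.

```python
import numpy as np


def train_som(data, rows=4, cols=4, n_iter=500, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows, cols, n_features).

    All defaults are illustrative assumptions, not recommended settings.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = data.shape
    # Assumed initialization: uniform random values within each variable's observed range.
    weights = rng.uniform(data.min(axis=0), data.max(axis=0),
                          size=(rows, cols, n_features))
    # Grid coordinates (row, col) of every map unit, shape (rows, cols, 2).
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"),
                    axis=-1)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3     # linearly shrinking neighborhood radius
        x = data[rng.integers(n_samples)]        # pick one sample at random
        # Best matching unit (BMU): the unit whose weight vector is closest to x.
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighborhood around the BMU, measured on the map grid.
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Pull every unit toward x, scaled by learning rate and neighborhood weight.
        weights += lr * h[..., None] * (x - weights)
    return weights


def best_matching_units(data, weights):
    """Return an (n_samples, 2) array with the (row, col) of each sample's BMU."""
    d = np.linalg.norm(weights[None] - data[:, None, None, :], axis=-1)
    flat = d.reshape(len(data), -1).argmin(axis=1)
    return np.stack(np.unravel_index(flat, weights.shape[:2]), axis=1)


if __name__ == "__main__":
    # Synthetic stand-in for a samples-by-variables table (not real monitoring data).
    rng = np.random.default_rng(1)
    samples = np.vstack([rng.normal(0.0, 1.0, size=(50, 3)),
                         rng.normal(10.0, 1.0, size=(50, 3))])
    som = train_som(samples)
    print(best_matching_units(samples, som)[:5])
```

The arguments of `train_som` (map size, number of iterations, initial learning rate, initial neighborhood radius, random seed) are the kind of settings that need to be reported for a model to be reproducible. In practice the input variables would usually be scaled before training, and the trained unit weights could then be passed to a clustering algorithm.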