Department of Mathematics, Faculty of Sciences, University of Oviedo, 33007, Oviedo, Spain.
Department of Construction and Manufacturing Engineering, University of Oviedo, 33204, Gijón, Spain.
Environ Sci Pollut Res Int. 2018 Aug;25(23):22658-22671. doi: 10.1007/s11356-018-2219-4. Epub 2018 May 30.
Cyanotoxins are poisonous substances produced by cyanobacteria, and they pose a health threat in waters used for drinking or recreation. It is therefore necessary to predict their presence in order to avoid risks. This paper presents a nonparametric machine learning approach that uses a gradient boosted regression tree (GBRT) model to predict cyanotoxin content from cyanobacterial concentrations determined experimentally in a reservoir in the north of Spain. GBRT models can give good predictions in highly nonlinear problems such as this one, where the studied variable combines low cyanotoxin concentrations with high concentration peaks. Two types of results have been obtained. First, the model allows the input variables to be ranked according to their importance in the model. Second, the high performance and simplicity of the model make the gradient boosted tree method attractive compared with conventional forecasting techniques.
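The two results named in the abstract (prediction in a problem with a low baseline and sharp peaks, and a ranking of the input variables by importance) can be illustrated with a minimal sketch of gradient boosting for squared-error regression. This is not the authors' model or data. It assumes depth-1 trees (stumps), a fixed learning rate, and a synthetic data set in which the first column stands in for a hypothetical cyanobacterial concentration and the second column is pure noise. All function names and parameter values are illustrative.

```python
import random


def fit_stump(X, residuals):
    """Return the single threshold split with the largest squared-error reduction."""
    n, n_features = len(X), len(X[0])
    total = sum(residuals)
    best = None  # (gain, feature, threshold, left_mean, right_mean)
    for j in range(n_features):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # no threshold separates equal values
            right_sum = total - left_sum
            # Reduction in the sum of squared errors produced by this split.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best


def fit_gbrt(X, y, n_trees=200, learning_rate=0.1):
    """Boost stumps on the residuals; also accumulate per-feature importance."""
    f0 = sum(y) / len(y)  # initial constant prediction
    pred = [f0] * len(y)
    stumps = []
    importance = [0.0] * len(X[0])
    for _ in range(n_trees):
        # For squared loss, the negative gradient is the plain residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, thr, left, right = stump
        stumps.append((j, thr, learning_rate * left, learning_rate * right))
        importance[j] += gain
        pred = [p + learning_rate * (left if row[j] <= thr else right)
                for p, row in zip(pred, X)]
    total = sum(importance) or 1.0
    return f0, stumps, [v / total for v in importance]


def predict(model, row):
    """Sum the initial constant and the shrunken contribution of every stump."""
    f0, stumps, _ = model
    return f0 + sum(l if row[j] <= thr else r for j, thr, l, r in stumps)


# Synthetic stand-in data (assumption): a low baseline with a high peak when
# the first predictor exceeds 0.8; the second predictor carries no signal.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
y = [0.1 + (5.0 if x0 > 0.8 else 0.0) + random.gauss(0, 0.05) for x0, _ in X]

model = fit_gbrt(X, y)
importance = model[2]
print("relative importance per predictor:", importance)
print("prediction at high concentration:", predict(model, [0.9, 0.5]))
print("prediction at low concentration:", predict(model, [0.1, 0.5]))
```

Importance here is the share of the total squared-error reduction attributed to each predictor, which is one common way to rank variables in boosted tree models. A practical study would more likely use deeper trees and a tested library implementation rather than this hand-written version.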