Faculty of Civil Engineering, Ton Duc Thang University, Ho Chi Minh City, Viet Nam.
GeoInformatic Unit, Geography Section, School of Humanities, Universiti Sains Malaysia, 11800, Pulau Pinang, Malaysia.
Mar Pollut Bull. 2021 Sep;170:112639. doi: 10.1016/j.marpolbul.2021.112639. Epub 2021 Jul 14.
Dissolved oxygen (DO) is an important indicator that environmental engineers and ecological scientists use to assess river health. This study aims to evaluate the reliability of four feature selection algorithms, i.e., Boruta, genetic algorithm (GA), multivariate adaptive regression splines (MARS), and extreme gradient boosting (XGBoost), in selecting the best-suited predictors among the applied water quality (WQ) parameters, and to compare four tree-based predictive models, namely random forest (RF), conditional random forests (cForest), RANdom forest GEneRator (Ranger), and XGBoost, in predicting changes in DO in the Klang River, Malaysia. The full feature set comprised 15 WQ parameters from monitoring-site data and 7 hydrological components from remote sensing data. All predictive models performed well with the features selected by the XGBoost and MARS algorithms, according to the applied statistical evaluators. Among all applied predictive models, the XGBoost model performed best when the features were selected by the MARS and XGBoost algorithms, with coefficient of determination (R²) values of 0.84 and 0.85, respectively, whereas the Boruta-XGBoost model delivered only marginal performance in this scenario.
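For readers unfamiliar with the evaluation metric, the coefficient of determination reported above can be sketched as follows. This is a minimal illustration under the common definition 1 - SS_res / SS_tot; the abstract does not state which exact formula the authors applied, and the function name `r_squared` and the sample DO values are hypothetical, not taken from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assuming the definition 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the model's predictions
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviation of observations from their mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical DO values in mg/L, invented purely for illustration
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]
score = r_squared(observed, predicted)
```

Under this definition, a value of 0.85 would mean the model accounts for roughly 85% of the variance in the observed DO values.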