Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China.
Sci Total Environ. 2024 Sep 15;943:173748. doi: 10.1016/j.scitotenv.2024.173748. Epub 2024 Jun 8.
In many coastal cities around the world, continuing water quality degradation threatens the living environment of humans and aquatic organisms. To assess and help control water pollution, this study estimated the Biochemical Oxygen Demand (BOD) concentration of Hong Kong's marine waters using remote sensing and an improved machine learning (ML) method. The scheme was derived from four ML algorithms (RBF, SVR, RF, XGB) and calibrated using a large set (N > 1000) of in-situ BOD measurements. Three types of models were trained and evaluated, each on a labeled dataset with different preprocessing: the original BOD, log(BOD), and label distribution smoothing (LDS). The results highlight the superior potential of the LDS-based model to improve BOD estimation by handling an imbalanced training dataset. Additionally, XGB and RF outperformed RBF and SVR when the model was developed using log(BOD) or LDS(BOD). Over two decades, the autumn (Sep. to Nov.) BOD concentration of Hong Kong marine waters showed a downward trend, with significant decreases in Deep Bay, Western Buffer, Victoria Harbour, Eastern Buffer, Junk Bay, Port Shelter, and the Tolo Harbour and Channel. Principal component analysis revealed that nutrient levels were the predominant factor in Victoria Harbour and the interior of Deep Bay, while chlorophyll-related and physical parameters were dominant in Southern, Mirs Bay, Northwestern, and the outlet of Deep Bay. LDS provides a new perspective for improving ML-based water quality estimation by alleviating the imbalance in the labeled dataset. Overall, remotely sensed BOD can offer insight into the spatial-temporal distribution of organic matter in Hong Kong coastal waters and valuable guidance for pollution control.
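As an illustration of the label distribution smoothing idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes LDS is used in its common form: the empirical label histogram is smoothed with a symmetric kernel, and each sample is weighted by the inverse of the smoothed density in its bin. The function name `lds_weights` and the parameters `n_bins`, `kernel_size`, and `sigma` are hypothetical; the abstract does not state the binning, the kernel, or how the weights enter training.

```python
import numpy as np


def lds_weights(labels, n_bins=50, kernel_size=5, sigma=2.0):
    """Per-sample weights from a kernel-smoothed label histogram.

    All parameter values are illustrative assumptions, not values
    taken from the study. kernel_size should be odd and <= n_bins.
    """
    labels = np.asarray(labels, dtype=float)

    # Empirical label distribution over equal-width bins.
    counts, edges = np.histogram(labels, bins=n_bins)

    # Symmetric Gaussian kernel, normalised to sum to 1.
    half = kernel_size // 2
    offsets = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()

    # Smooth the histogram so neighbouring label bins share density.
    smoothed = np.convolve(counts, kernel, mode="same")

    # Look up the smoothed density of the bin each sample falls into.
    bin_idx = np.clip(np.digitize(labels, edges[1:-1]), 0, n_bins - 1)
    density = smoothed[bin_idx]

    # Inverse-density weights, rescaled so the mean weight is 1.
    weights = 1.0 / np.maximum(density, 1e-8)
    return weights * len(weights) / weights.sum()


# Synthetic, right-skewed stand-in for BOD labels (not real data).
rng = np.random.default_rng(0)
bod = rng.lognormal(mean=0.0, sigma=0.6, size=1000)
w = lds_weights(bod)
```

Under this assumption, rare high-BOD samples receive larger weights than samples near the bulk of the distribution. Such weights could then be passed to a regressor that accepts per-sample weights, for example through the `sample_weight` argument of `fit` in scikit-learn's `RandomForestRegressor`.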