Hu Yongmin, Morgenroth Eberhard, Jacquin Céline
Eawag: Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland; ETH Zürich, Institute of Environmental Engineering, 8093 Zürich, Switzerland.
Water Res. 2025 Jan 1;268(Pt A):122604. doi: 10.1016/j.watres.2024.122604. Epub 2024 Oct 11.
Growing interest in water reuse is accompanied by concern about water quality. Excitation-emission matrix (EEM) measurements, which are widely used in laboratory analysis, are emerging as a promising tool for characterizing both microbial and chemical water quality in the online monitoring of water reuse systems. However, the robustness of EEM measurements has rarely been validated in actual online monitoring campaigns, where predictions must be made for new samples that are independent of those used to establish the EEM analysis models, including the popular parallel factor analysis (PARAFAC). In this study, two strategies for conducting PARAFAC were examined for the online monitoring of a greywater reuse system, using two EEM datasets from two monitoring periods for model establishment and model testing, respectively. With the first strategy, which is commonly used in laboratory analyses, the entire EEM dataset from one period was used to establish a single PARAFAC model, and the maximum fluorescence intensity (F) of a PARAFAC component was used to predict total cell count (TCC) in the other period. However, because of interference from background dissolved organic matter (DOM) fluorescence, F gave unreliable predictions in model testing. To address this problem, a second, novel strategy was proposed that uses an EEM clustering and PARAFAC component shift mining technique. This unsupervised algorithm, named K-PARAFACs, automatically groups EEMs into K clusters and establishes on each cluster a cluster-specific PARAFAC model with distinct component shapes. With this method, multiple PARAFAC models were established on one EEM dataset, with each model representing samples with certain TCC ranges and DOM compositions. In model testing, these cluster-specific PARAFAC models served as EEM classifiers: a new sample was characterized not by F but by the cluster-specific model that fitted the sample's EEM signal with the least numerical error. The proposed strategy demonstrated its robustness by successfully predicting the TCC trend in the test datasets. Our findings suggest that K-PARAFACs is a promising tool that enables robust qualitative monitoring of water reuse systems with background DOM variability.
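The model-testing step described in the abstract, in which a new EEM is assigned to the cluster-specific model that fits it with the least numerical error, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each cluster model is assumed to be available as fixed excitation and emission loadings, sample scores are estimated by unconstrained least squares (a real PARAFAC workflow would typically enforce non-negativity), and the error is taken as the sum of squared residuals. All function names, the toy loadings, and the labels `low_TCC` and `high_TCC` are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Component = Tuple[Vector, Vector]  # (excitation loading, emission loading)


def flatten_outer(ex: Vector, em: Vector) -> Vector:
    """Rank-one EEM of one component (outer product), flattened row by row."""
    return [a * b for a in ex for b in em]


def solve_linear(a: List[Vector], b: Vector) -> Vector:
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_error(eem: List[Vector], components: List[Component]) -> Tuple[Vector, float]:
    """Fit one EEM with fixed component shapes; return (scores, squared error)."""
    x = [v for row in eem for v in row]
    basis = [flatten_outer(ex, em) for ex, em in components]
    # Normal equations of the least-squares problem: (B B^T) s = B x
    gram = [[sum(p * q for p, q in zip(bi, bj)) for bj in basis] for bi in basis]
    rhs = [sum(p * q for p, q in zip(bi, x)) for bi in basis]
    scores = solve_linear(gram, rhs)
    recon = [sum(s * bi[k] for s, bi in zip(scores, basis)) for k in range(len(x))]
    sse = sum((xv - rv) ** 2 for xv, rv in zip(x, recon))
    return scores, sse


def classify(eem: List[Vector], models: Dict[str, List[Component]]) -> Tuple[str, Dict[str, float]]:
    """Assign the EEM to the cluster model with the smallest fitting error."""
    errors = {name: fit_error(eem, comps)[1] for name, comps in models.items()}
    best = min(errors, key=lambda name: errors[name])
    return best, errors


# Hypothetical cluster-specific models on a toy 4 x 5 (excitation x emission) grid.
# The two models share the number of components but differ in peak positions.
models: Dict[str, List[Component]] = {
    "low_TCC": [
        ([1.0, 0.5, 0.1, 0.0], [0.1, 0.8, 1.0, 0.3, 0.0]),
        ([0.0, 0.2, 0.9, 1.0], [0.0, 0.1, 0.4, 1.0, 0.6]),
    ],
    "high_TCC": [
        ([0.6, 1.0, 0.3, 0.0], [0.5, 1.0, 0.6, 0.1, 0.0]),
        ([0.0, 0.1, 0.7, 1.0], [0.0, 0.0, 0.2, 0.7, 1.0]),
    ],
}

# Synthetic "new sample" built from the high_TCC component shapes.
true_scores = [2.0, 0.5]
sample = [
    [sum(s * ex[i] * em[j] for s, (ex, em) in zip(true_scores, models["high_TCC"]))
     for j in range(5)]
    for i in range(4)
]

label, errors = classify(sample, models)
print(label, errors)
```

In this sketch the sample is labeled by which set of component shapes reproduces it best, rather than by the magnitude of any single component's intensity, which mirrors the idea of using the cluster-specific models as classifiers.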