Department of Knowledge-Based Mathematical Systems, Johannes Kepler University Linz, Austria.
BCAM - Basque Center for Applied Mathematics, Bilbao, Spain.
Anal Chim Acta. 2018 Jul 12;1013:1-12. doi: 10.1016/j.aca.2018.02.003. Epub 2018 Feb 8.
The physico-chemical properties of melamine formaldehyde (MF)-based thermosets are largely influenced by the degree of polymerization (DP) in the underlying resin. On-line supervision of the turbidity point by means of vibrational spectroscopy has recently emerged as a promising technique to monitor the DP of MF resins. However, spectroscopic determination of the DP relies on chemometric models, which are usually sensitive to drifts caused by instrumental and/or sample-associated changes occurring over time. In order to detect the time point when drifts start causing prediction bias, we here explore a universal drift detector based on a faded version of the Page-Hinkley (PH) statistic, which we test on three data streams from an industrial MF resin production process. We employ committee disagreement (CD), computed as the variance of model predictions from an ensemble of partial least squares (PLS) models, as a measure of sample-wise prediction uncertainty and use the PH statistic to detect changes in this quantity. We further explore supervised and unsupervised strategies for (semi-)automatic model adaptation upon detection of a drift. For the former, manual reference measurements are requested whenever statistical thresholds on Hotelling's T² and/or Q-residuals are violated. Models are subsequently re-calibrated using weighted partial least squares in order to increase the influence of newer samples, which increases the flexibility when adapting to new (drifted) states. Unsupervised model adaptation is carried out by exploiting the dual antecedent-consequent structure of a recently developed fuzzy systems variant of PLS termed FLEXFIS-PLS. In particular, antecedent parts are updated while maintaining the internal structure of the local linear predictors (i.e. the consequents). We found improved drift detection capability of the CD compared to Hotelling's T² and Q-residuals when used in combination with the proposed PH test. Furthermore, we found that when a drift leads to persistent prediction bias, active selection of samples for subsequent model adaptation by active learning (AL) is advantageous compared to passive (random) selection, allowing more rapid adaptation at lower reference measurement rates. Fully unsupervised adaptation using FLEXFIS-PLS significantly improved predictive accuracy for light drifts but could not fully compensate for prediction bias in cases of significant lack of fit with respect to the latent variable space.
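The detection scheme described in the abstract (committee disagreement monitored by a faded Page-Hinkley test) can be illustrated with a minimal sketch. The abstract does not give the exact form of the faded statistic, so the code below assumes one common formulation in which the cumulative deviation is multiplied by a forgetting factor at each step. The class name `FadedPageHinkley` and the parameters `delta`, `threshold`, and `fading` are hypothetical, and the ensemble predictions are taken as plain lists of numbers rather than outputs of actual PLS models.

```python
from statistics import pvariance


def committee_disagreement(predictions):
    """Variance of the ensemble members' predictions for a single sample."""
    return pvariance(predictions)


class FadedPageHinkley:
    """Page-Hinkley test for an upward shift, with a forgetting factor.

    Assumed formulation (not taken from the paper):
        m_t = fading * m_{t-1} + (x_t - running_mean_t - delta)
        alarm if m_t - min(m_1..m_t) > threshold
    """

    def __init__(self, delta=0.005, threshold=1.0, fading=0.99):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm threshold (often called lambda)
        self.fading = fading        # forgetting factor in (0, 1]
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.cum_min = 0.0

    def update(self, x):
        """Feed one observation; return True if a drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n  # incremental running mean
        self.cum = self.fading * self.cum + (x - self.mean - self.delta)
        self.cum_min = min(self.cum_min, self.cum)
        return (self.cum - self.cum_min) > self.threshold
```

In use, each incoming spectrum would be predicted by every ensemble member, the resulting predictions passed to `committee_disagreement`, and that value fed to `update`. This sketch does not reset after an alarm, so it keeps signalling until the caller re-initializes the detector or adapts the model.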