Wageningen Food and Biobased Research, Wageningen, the Netherlands.
School of Biosystems and Food Engineering, University College Dublin (UCD), Dublin, Ireland.
Anal Chim Acta. 2022 Aug 15;1221:340142. doi: 10.1016/j.aca.2022.340142. Epub 2022 Jul 11.
Predictive latent-space modelling of near-infrared (NIR) spectra with partial least squares (PLS) involves two main tasks that require user input to achieve optimal models. The first is selecting the optimal pre-processing of the NIR spectra; the second is selecting the optimal number of PLS model components, assuming the data are outlier free. Often the two tasks are performed together in an exhaustive search to find the best pre-processing as well as the optimal number of model components. We propose a novel approach called meta partial least squares (META-PLS), which removes the need for both pre-processing optimisation and an exhaustive search for the optimal number of model components. The approach uses the stepwise nature of the PLS algorithm to learn complementary information from differently pre-processed forms of the same data set, as is done in multiblock pre-processing ensemble models; this avoids pre-processing selection while still benefiting from the pre-processing ensemble. A weighted randomisation test then decides the optimal number of model components automatically. The performance of the approach for automatic NIR spectral modelling is demonstrated with several real data sets.
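The two ideas in the abstract, stepwise learning from several pre-processed copies of the same spectra and a randomisation test that stops adding components, can be illustrated with a small sketch. This is not the authors' META-PLS implementation. Everything in it is an assumption made for illustration: the helper names (`snv`, `first_diff`, `best_block`, `fit`), the synthetic data, the rule of taking each component from the block that leaves the smallest residual, and the use of a plain (unweighted) permutation test in place of the weighted randomisation test named above.

```python
import math
import random

def snv(row):
    # Standard normal variate: centre and scale one spectrum.
    m = sum(row) / len(row)
    s = math.sqrt(sum((a - m) ** 2 for a in row) / (len(row) - 1))
    return [(a - m) / s for a in row]

def first_diff(row):
    # First difference, a crude stand-in for a derivative filter.
    return [b - a for a, b in zip(row, row[1:])]

def center_columns(X):
    means = [sum(col) / len(col) for col in zip(*X)]
    return [[a - m for a, m in zip(row, means)] for row in X]

def pls_score(X, y):
    # One PLS1 step: weights w proportional to X'y, scores t = Xw.
    w = [sum(a * b for a, b in zip(col, y)) for col in zip(*X)]
    norm = math.sqrt(sum(a * a for a in w)) or 1.0
    return [sum(a * b for a, b in zip(row, w)) / norm for row in X]

def explained_ss(t, y):
    # Sum of squares of y explained by regressing y on the score t.
    tt = sum(a * a for a in t)
    ty = sum(a * b for a, b in zip(t, y))
    return ty * ty / tt if tt > 1e-12 else 0.0

def best_block(blocks, y):
    # Candidate component from every block; keep the one leaving the smallest residual.
    cands = [(name, pls_score(X, y)) for name, X in blocks.items()]
    name, t = max(cands, key=lambda c: explained_ss(c[1], y))
    return explained_ss(t, y), name, t

def deflate(v, t):
    # Remove from v its projection onto the score t.
    b = sum(a * c for a, c in zip(t, v)) / sum(a * a for a in t)
    return [c - b * a for a, c in zip(t, v)]

def deflate_columns(X, t):
    return [list(r) for r in zip(*(deflate(col, t) for col in zip(*X)))]

def fit(blocks, y, max_comp=5, n_perm=99, alpha=0.05, seed=0):
    rng = random.Random(seed)
    blocks = {k: center_columns(X) for k, X in blocks.items()}
    y_mean = sum(y) / len(y)
    resid = [a - y_mean for a in y]
    chosen = []
    for _ in range(max_comp):
        obs, name, t = best_block(blocks, resid)
        exceed = 0
        for _ in range(n_perm):
            perm = resid[:]
            rng.shuffle(perm)  # break the link between spectra and response
            exceed += best_block(blocks, perm)[0] >= obs
        if (1 + exceed) / (1 + n_perm) > alpha:
            break  # component not distinguishable from chance: stop
        chosen.append(name)
        resid = deflate(resid, t)
        blocks = {k: deflate_columns(X, t) for k, X in blocks.items()}
    return chosen, resid

# Synthetic demo (hypothetical data): one absorbance peak plus a random baseline offset.
rng = random.Random(1)
peak = [math.exp(-((j - 7) ** 2) / 12.5) for j in range(15)]
conc = [rng.uniform(0.2, 1.0) for _ in range(40)]
raw = []
for c in conc:
    offset = rng.uniform(0.0, 0.5)
    raw.append([offset + c * p + rng.gauss(0.0, 0.01) for p in peak])
y = [c + rng.gauss(0.0, 0.02) for c in conc]
blocks = {"raw": raw, "snv": [snv(r) for r in raw], "diff": [first_diff(r) for r in raw]}
chosen, resid = fit(blocks, y)
print("block chosen for each retained component:", chosen)
```

In this sketch the user never picks a pre-processing or a component count: every block competes for each component, and the loop stops at the first component whose explained variance could plausibly arise from a shuffled response. Deflating all blocks by the winning score is one simple way to make later components complementary to earlier ones; other multiblock schemes handle this differently.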