Li Tianqi, Le Hieu Minh, Handoyo Renato, Pagliano Enea, Hu Yaxi
Department of Chemistry, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario, K1S 5B6, Canada.
Department of Chemistry, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario, K1S 5B6, Canada; Metrology Research Center, National Research Council Canada, 1200 Montreal Road, Ottawa, Ontario, K1A 0R6, Canada.
Talanta. 2025 Nov 1;294:128239. doi: 10.1016/j.talanta.2025.128239. Epub 2025 Apr 29.
The demand for plant-based milk alternatives (PBMA) has increased substantially, especially among consumers who are allergic and/or intolerant to animal dairy products and consumers attentive to environmental sustainability. With market expansion and higher production costs, fraudulent activities involving PBMA are of great concern. To validate the authenticity of PBMA products, a headspace solid-phase microextraction gas chromatography mass spectrometry (HS-SPME-GC-MS) method was developed and optimized to differentiate eight types of PBMA (almond, cashew, hazelnut, walnut, oat, peanut, pistachio, and macadamia) on the basis of their volatile metabolic profile (i.e., volatilome). A total of 80 samples (10 replicates for each type of PBMA) were analyzed using HS-SPME-GC-MS, followed by data preprocessing and construction of classification models using machine learning algorithms. Approximately 143 volatile compounds were identified based on the MS-DIAL database (version 4.9.221218). Three machine learning algorithms were tested. Support Vector Machine (SVM) achieved the best performance (100 % calibration accuracy and 98.8 % cross-validation accuracy), followed by Random Forest (RF, 100 % and 94.3 %) and k-Nearest Neighbor (kNN, 98.8 % and 88.8 %). To further validate robustness, an additional 32 samples (4 biological replicates for each type of PBMA) were prepared, analyzed, and classified with these models. SVM achieved an accuracy of 100 %, followed by RF (96.9 %) and kNN (90.6 %). RF yielded accuracy comparable to that of SVM and additionally provided information about the features contributing most to classification. RF was therefore used to identify the 30 most relevant volatile metabolites. A simplified RF model constructed using only these 30 features achieved a calibration accuracy of 100 %, a cross-validation accuracy of 96.5 %, and a validation accuracy of 96.9 %, indicating that these 30 metabolic features have great potential as markers for targeted authentication. By combining non-targeted HS-SPME-GC-MS with machine learning, a highly accurate and reliable workflow for the authentication of PBMA was established. This method can help ensure product integrity and protect both consumer health and the economy of this emerging sector.
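The modelling steps described in the abstract (three classifiers compared by calibration, cross-validation, and held-out validation accuracy, then RF feature ranking and a reduced 30-feature model) can be illustrated with a minimal scikit-learn sketch. This is not the authors' pipeline. The data are synthetic stand-ins for a peak table, and the linear SVM kernel, autoscaling, k = 3, 500 trees, 5-fold cross-validation, and impurity-based importance are all assumptions chosen for illustration, since the abstract does not state them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Dimensions mirror the abstract; the data themselves are synthetic (assumption).
N_CLASSES, N_FEATURES, N_INFORMATIVE = 8, 143, 30
rng = np.random.default_rng(0)

# Hypothetical class centroids: only the first 30 features carry class signal.
centers = np.zeros((N_CLASSES, N_FEATURES))
centers[:, :N_INFORMATIVE] = rng.normal(0.0, 3.0, (N_CLASSES, N_INFORMATIVE))


def make_samples(n_per_class):
    """Draw synthetic 'peak table' rows scattered around each class centroid."""
    y = np.repeat(np.arange(N_CLASSES), n_per_class)
    X = centers[y] + rng.normal(0.0, 1.0, (y.size, N_FEATURES))
    return X, y


X_cal, y_cal = make_samples(10)  # 80 calibration samples (10 per class)
X_val, y_val = make_samples(4)   # 32 held-out validation samples (4 per class)

# Hyperparameters below are illustrative assumptions, not reported values.
models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "RF": RandomForestClassifier(n_estimators=500, random_state=0),
    "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3)),
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # cross_val_score fits clones, so the model is fitted separately afterwards.
    cv_acc = cross_val_score(model, X_cal, y_cal, cv=cv).mean()
    model.fit(X_cal, y_cal)
    results[name] = {
        "calibration": model.score(X_cal, y_cal),
        "cross_validation": cv_acc,
        "validation": model.score(X_val, y_val),
    }
    print(name, results[name])

# Rank features by the fitted RF's impurity-based importance; keep the top 30.
top30 = np.argsort(models["RF"].feature_importances_)[::-1][:30]

# Simplified RF restricted to those 30 features. Selecting features on the full
# calibration set before cross-validating makes the CV estimate optimistic.
rf_small = RandomForestClassifier(n_estimators=500, random_state=0)
small_cv = cross_val_score(rf_small, X_cal[:, top30], y_cal, cv=cv).mean()
rf_small.fit(X_cal[:, top30], y_cal)
small_val = rf_small.score(X_val[:, top30], y_val)
print("RF (top 30 features)", {"cross_validation": small_cv, "validation": small_val})
```

With real data, `X_cal` and `X_val` would be replaced by the aligned peak-intensity tables. The accuracies printed here reflect only the synthetic data and should not be compared with the values reported above.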