Wang Liqi, Wu Dandan, Liu Chang, Zhang Hairong, Fan Zhipeng, Liu Feng, Yu Dianyu
School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150028, China; College of Food Engineering, Harbin University of Commerce, Harbin 150028, China.
School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150028, China.
Int J Biol Macromol. 2025 Jun;313:144286. doi: 10.1016/j.ijbiomac.2025.144286. Epub 2025 May 17.
The contents and ratios of 7S and 11S globulins are crucial to the nutritional value and functional properties of soybean proteins. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) is typically used to detect 7S and 11S globulins in soybeans; however, this method is slow and costly. Near-infrared (NIR) spectroscopy has emerged as a technique for detecting soybean protein content, enabling rapid, non-destructive testing with the advantages of convenient measurement, minimal sample preparation, and simultaneous determination of multiple components. To address the difficulty of sharing NIR-based quantitative prediction models for 7S and 11S protein content between various soybean seeds and soybean powders, a standard-free model transfer method based on transfer learning (TL) was proposed. First, NIR data were collected from soybean samples in different forms, and NIR prediction models for 7S and 11S protein content were established. Second, the direct standardization (DS) and piecewise direct standardization (PDS) algorithms were improved to yield a DS-PDS-based model transfer method, and the influence of the order in which preprocessing and the model transfer algorithm are applied on the overall transfer scheme was explored. Then, invariant risk minimization (IRM) was used to force the model to learn invariant features that are causally related to the labels, by constraining the optimal classifier to be consistent across different environments. Finally, because traditional model transfer methods require a set of standard samples measured as paired master and slave spectra, the transfer performance of a model transfer method based on standard-free transfer learning was investigated. The results showed that the standard-free transfer learning method was more suitable for modeling 7S and 11S globulin content across soybean seeds and soybean powders.
This work is intended to provide soybean processing enterprises with efficient and accurate methods for detecting 7S and 11S protein content, and to support quality control of soybean protein products and the production of functional products.
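As background for the DS step named in the abstract, the following is a minimal sketch of classic direct standardization, not the authors' improved DS-PDS method and not their standard-free transfer learning approach. It assumes paired standard spectra measured in both forms (master and slave), which is exactly the requirement the standard-free method is meant to remove. The function names, array shapes, and synthetic data are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def fit_direct_standardization(slave_std, master_std):
    """Estimate a transfer matrix F so that slave_std @ F approximates master_std.

    Both inputs are (n_standard_samples, n_wavelengths) arrays of paired spectra.
    The least-squares solution is obtained with the Moore-Penrose pseudoinverse.
    """
    return np.linalg.pinv(slave_std) @ master_std


def apply_direct_standardization(slave_spectra, transfer_matrix):
    """Map new slave spectra into the master domain."""
    return slave_spectra @ transfer_matrix


# Synthetic demonstration (hypothetical data, not from the paper):
# the slave spectra are a linear distortion of the master spectra.
rng = np.random.default_rng(0)
n_wavelengths = 5
master_std = rng.normal(size=(20, n_wavelengths))
distortion = np.eye(n_wavelengths) + 0.1 * rng.normal(size=(n_wavelengths, n_wavelengths))
slave_std = master_std @ distortion

transfer_matrix = fit_direct_standardization(slave_std, master_std)

# New samples seen only in the slave form are corrected with the fitted matrix.
new_master = rng.normal(size=(3, n_wavelengths))
new_slave = new_master @ distortion
corrected = apply_direct_standardization(new_slave, transfer_matrix)
```

In this noise-free toy case the corrected spectra match the master spectra, so a calibration model built on master data could be applied to them directly. Real NIR spectra have far more wavelengths than standard samples, which is one reason piecewise variants such as PDS fit a separate small regression per wavelength window.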