Wilmar International Limited, WIL@NUS Corporate Lab, Center for Translational Medicine, 14 Medical Drive, Singapore, Singapore.
Yihai Kerry Arawana Oils, Grains & Food Co., Ltd, Arawana Building, No. 1379 Bocheng Road, Pudong New District, Shanghai, China.
Nat Commun. 2020 Oct 23;11(1):5353. doi: 10.1038/s41467-020-19137-6.
Previous studies have shown that each edible oil type has its own characteristic fatty acid profile; however, no method has yet been described that identifies oil types based on this characteristic alone. Moreover, the fatty acid profile of a specific oil type can be mimicked by a mixture of two or more oil types. This has led to fraudulent adulteration and intentional mislabeling of edible oils, threatening food safety and endangering public health. Here, we present a machine learning method to uncover fatty acid patterns that discriminate between ten different plant oil types and capture their intra-type variability. We also describe a supervised end-to-end learning method that generalizes to the oil composition of any given mixture. The model is trained on a large number of simulated oil mixtures; validation on an independent test dataset shows a 50th-percentile absolute error of 1.4-1.8% and a 90th-percentile error of 4-5.4% for any 3-way mixture of the ten oil types. The deep learning model can be further refined with online training. Because oil-producing plants have diverse geographical origins and hence slightly varying fatty acid profiles, online training also provides a way to capture useful knowledge that is presently unavailable. Our method supports product quality control and the determination of a fair price for purchased oils, and in turn offers health-conscious consumers the prospect of accurate labeling.
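The simulated 3-way mixtures and the percentile error figures quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the fatty acid profiles are random placeholders rather than measured data, the number of fatty acids (`N_FATTY_ACIDS`) is arbitrary, the mixture profile is assumed to be a fraction-weighted average of the component profiles, and the error is assumed to be pooled over all oil fractions. The function names `simulate_three_way_mixtures` and `percentile_abs_error` are hypothetical, and the noisy "prediction" merely stands in for a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

N_OILS = 10         # ten plant oil types, as stated in the abstract
N_FATTY_ACIDS = 8   # assumption: arbitrary number of measured fatty acids

# Placeholder profiles: one row per oil type, each row sums to 1.
# These are random values, NOT real fatty acid measurements.
profiles = rng.dirichlet(np.ones(N_FATTY_ACIDS), size=N_OILS)


def simulate_three_way_mixtures(profiles, n_samples, rng):
    """Draw random 3-way mixtures and return (mixed profiles, true fractions)."""
    n_oils = profiles.shape[0]
    fractions = np.zeros((n_samples, n_oils))
    for i in range(n_samples):
        # Pick three distinct oil types and random proportions summing to 1.
        chosen = rng.choice(n_oils, size=3, replace=False)
        fractions[i, chosen] = rng.dirichlet(np.ones(3))
    # Assumption: linear mixing, i.e. a fraction-weighted average of profiles.
    mixed = fractions @ profiles
    return mixed, fractions


def percentile_abs_error(true_fractions, predicted_fractions, q):
    """q-th percentile of absolute error, in percentage points, pooled over oils."""
    abs_err = np.abs(true_fractions - predicted_fractions) * 100.0
    return float(np.percentile(abs_err, q))


mixed, fractions = simulate_three_way_mixtures(profiles, n_samples=1000, rng=rng)

# Stand-in for a model's output: the true fractions plus small Gaussian noise.
predicted = fractions + rng.normal(scale=0.02, size=fractions.shape)

p50 = percentile_abs_error(fractions, predicted, 50)
p90 = percentile_abs_error(fractions, predicted, 90)
print(f"50th-percentile absolute error: {p50:.2f}%")
print(f"90th-percentile absolute error: {p90:.2f}%")
```

In a real setting, `mixed` would be the model input and `fractions` the regression target; the percentile metric is then computed on a held-out set that the model never saw during training.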