Herrera-Rocha Fabio, Fernández-Niño Miguel, Duitama Jorge, Cala Mónica P, Chica María José, Wessjohann Ludger A, Davari Mehdi D, Barrios Andrés Fernando González
Grupo de Diseño de Productos y Procesos (GDPP), Department of Chemical and Food Engineering, Universidad de los Andes, 111711, Bogotá, Colombia.
Leibniz-Institute of Plant Biochemistry, Department of Bioorganic Chemistry, Weinberg 3, 06120, Halle, Germany.
J Cheminform. 2024 Dec 10;16(1):140. doi: 10.1186/s13321-024-00935-9.
Flavor is the main factor driving consumer acceptance of food products. However, tracking the biochemistry of flavor is a formidable challenge due to the complexity of food composition. Current methodologies for linking individual molecules to flavor in foods and beverages are expensive and time-consuming. Predictive models based on machine learning (ML) are emerging as an alternative to speed up this process. Nonetheless, the optimal approach to predicting the flavor features of molecules remains elusive. In this work we present FlavorMiner, an ML-based multilabel flavor predictor. FlavorMiner integrates different combinations of algorithms and mathematical representations, augmented with class balance strategies to address the inherent class imbalance of the input dataset. Notably, Random Forest and K-Nearest Neighbors combined with Extended Connectivity Fingerprints and RDKit molecular descriptors outperform other combinations in most cases. Resampling strategies surpass weight balance methods in mitigating the bias associated with class imbalance. FlavorMiner achieves an average ROC AUC score of 0.88. The tool was used to analyze cocoa metabolomics data, showing its potential to help extract valuable insights from complex food metabolomics data. FlavorMiner can be used for flavor mining in any food product, drawing on a diverse training dataset that covers 934 distinct food products.
Scientific Contribution: FlavorMiner is an ML-based tool designed to predict molecular flavor features with high accuracy and efficiency, addressing the complexity of food metabolomics. By pairing robust algorithm combinations with mathematical representations of molecules, FlavorMiner achieves high predictive performance. Applied to cocoa metabolomics, FlavorMiner extracted meaningful insights, showing its versatility for flavor analysis across diverse food products. This study underscores the potential of ML to accelerate flavor biochemistry research, offering a scalable solution for the food and beverage industry.
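The two evaluation ideas named in the abstract, resampling to counter class imbalance and an average ROC AUC over flavor labels, can be illustrated with a minimal standard-library sketch. This is not FlavorMiner's code: the function names (`roc_auc`, `macro_roc_auc`, `random_oversample`) are hypothetical, plain random oversampling stands in for whichever resampling strategy the authors used, and an unweighted (macro) mean over labels is assumed because the abstract does not state the averaging scheme.

```python
import random


def roc_auc(y_true, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation.

    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC AUC needs both classes present")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def macro_roc_auc(label_matrix, score_matrix):
    """Unweighted mean of per-label ROC AUC (assumed averaging scheme).

    Rows are molecules; columns are flavor labels (hypothetical layout).
    """
    n_labels = len(label_matrix[0])
    aucs = []
    for j in range(n_labels):
        y = [row[j] for row in label_matrix]
        s = [row[j] for row in score_matrix]
        aucs.append(roc_auc(y, s))
    return sum(aucs) / len(aucs)


def random_oversample(features, labels, seed=0):
    """Duplicate random minority-class rows until both classes are equal in size.

    A stand-in for the resampling step, applied to one binary label at a time.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        raise ValueError("cannot oversample a class with no examples")
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(labels))) + extra
    return [features[i] for i in idx], [labels[i] for i in idx]


if __name__ == "__main__":
    # Toy data: 4 molecules, 2 flavor labels (values are illustrative only).
    y = [[0, 1], [0, 0], [1, 1], [1, 0]]
    p = [[0.1, 0.9], [0.4, 0.2], [0.35, 0.8], [0.8, 0.3]]
    print(macro_roc_auc(y, p))
```

In a multilabel setting such as this one, resampling would typically be applied to the training split only, one label at a time, so that the held-out data used for ROC AUC keeps its original class distribution.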