Balakrishna Yusentha, Manda Samuel, Mwambi Henry, van Graan Averalda
Biostatistics Research Unit, South African Medical Research Council, Durban, South Africa.
School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, South Africa.
Front Nutr. 2023 Oct 13;10:1186221. doi: 10.3389/fnut.2023.1186221. eCollection 2023.
Identifying classes of nutritionally similar food items is important for creating food exchange lists that meet health requirements and for informing nutrition guidelines and campaigns. Cluster analysis methods can assign food items to classes based on the similarity of their nutrient contents. Finite mixture models classify items probabilistically, which has the advantage of accounting for uncertainty in the class thresholds.
This paper uses univariate Gaussian mixture models to determine the probabilistic classification of food items in the South African Food Composition Database (SAFCDB) based on nutrient content.
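As a rough illustration of what probabilistic classification with a univariate Gaussian mixture involves, the sketch below fits a two-component mixture by expectation-maximisation and reports class membership probabilities for a few values. It is not the authors' implementation: the data are synthetic stand-ins (not SAFCDB values), the number of classes is fixed at two by assumption, and the function names `fit_gmm_1d` and `posterior` are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior(x, weights, means, sds):
    """Probability that value x belongs to each mixture class."""
    dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(dens) or 1e-300  # guard against numerical underflow
    return [d / total for d in dens]


def fit_gmm_1d(values, k, n_iter=200, min_sd=1e-3):
    """Fit a k-class univariate Gaussian mixture by EM (hypothetical helper)."""
    xs = sorted(values)
    n = len(xs)
    # Initialise each class from one of k equal-sized slices of the sorted data.
    chunks = [xs[i * n // k:(i + 1) * n // k] for i in range(k)]
    weights = [len(c) / n for c in chunks]
    means = [sum(c) / len(c) for c in chunks]
    sds = [max(min_sd, math.sqrt(sum((x - m) ** 2 for x in c) / len(c)))
           for c, m in zip(chunks, means)]
    for _ in range(n_iter):
        # E-step: class membership probabilities for every item.
        resp = [posterior(x, weights, means, sds) for x in values]
        # M-step: update weight, mean and sd of each class.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, values)) / nj
            sds[j] = max(min_sd, math.sqrt(var))
    # Return the classes ordered from low to high mean.
    order = sorted(range(k), key=lambda j: means[j])
    return ([weights[j] for j in order],
            [means[j] for j in order],
            [sds[j] for j in order])


# Synthetic nutrient contents (arbitrary units per 100 g), NOT real SAFCDB data.
rng = random.Random(0)
data = [rng.gauss(5.0, 2.0) for _ in range(150)]
data += [rng.gauss(60.0, 10.0) for _ in range(100)]

weights, means, sds = fit_gmm_1d(data, k=2)
for x in (5.0, 30.0, 60.0):
    probs = posterior(x, weights, means, sds)
    print(x, [round(p, 3) for p in probs])
```

Each item receives a probability of membership in every class rather than a hard label, so a value lying between two class means can be reported as uncertain. In practice the number of classes would be chosen from the data rather than fixed in advance.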
Classifying food items by animal protein, fatty acid, available carbohydrate, total fibre, sodium, iron, vitamin A, thiamin and riboflavin content produced data-driven classes with differing means and estimates of variability. These classes could be clearly ranked on a scale from low to high nutrient content. Classifying food items by sodium content resulted in five classes, with class means ranging from 1.57 to 706.27 mg per 100 g. Four classes were identified based on available carbohydrate content, with the highest carbohydrate class having a mean content of 59.15 g per 100 g. Food items clustered into two classes when classified by fatty acid content. Three classes were identified for iron, and the high-iron class had a mean content of 1.46 mg per 100 g. Classes containing nutrient-rich food items with extreme nutrient values were also identified for several vitamins and minerals.
Overlap between classes was evident, which supports the use of probabilistic classification methods. The food items in each identified class were comparable to the allowed food lists developed for therapeutic diets. This data-driven ranking of nutritionally similar classes could be considered when planning diets for people with medical conditions or dietary restrictions.