The Use of Explainable Machine Learning for the Prediction of the Quality of Bulk-Tank Milk in Sheep and Goat Farms.

Authors

Lianou Daphne T, Kiouvrekis Yiannis, Michael Charalambia K, Vasileiou Natalia G C, Psomadakis Ioannis, Politis Antonis P, Katsafadou Angeliki I, Katsarou Eleni I, Bourganou Maria V, Liagka Dimitra V, Chatzopoulos Dimitrios C, Solomakos Nikolaos M, Fthenakis George C

Affiliations

Veterinary Faculty, University of Thessaly, 43100 Karditsa, Greece.

Faculty of Public and One Health, University of Thessaly, 43100 Karditsa, Greece.

Publication information

Foods. 2024 Dec 12;13(24):4015. doi: 10.3390/foods13244015.

Abstract

The objective of the present study was to develop computational models for predicting the quality of bulk-tank milk in dairy sheep and goat farms. Our hypothesis was that variables related to the health management applied in the farm can support predictions of values related to milk quality, specifically fat content, protein content, fat and protein content combined, somatic cell counts, and total bacterial counts. Bulk-tank milk from 325 sheep and 119 goat farms was collected and evaluated by established techniques for analysis of fat and protein content, for somatic cell counting, and for total bacterial counting. Subsequently, computational models were constructed for the prediction of five target values: (a) fat content, (b) protein content, (c) fat and protein content combined, (d) somatic cell counts, and (e) total bacterial counts, using 21 independent variables related to factors prevalent in the farm. Five machine learning tools were employed: decision trees (18 different models evaluated), random forests (16 models), XGBoost (240 models), k-nearest neighbours (72 models), and neural networks (576 models); in total, 9220 evaluations were performed. For each target value, the tool with the lowest mean absolute percentage error (MAPE) among the five tools was selected. In sheep farms, for the prediction of protein content, k-nearest neighbours was selected (MAPE: 3.95%); for the prediction of fat and protein content combined, neural networks were selected (6.00%); and for the prediction of somatic cell counts, random forests and k-nearest neighbours were selected (6.55%); no tool provided useful predictions for fat content or for total bacterial counts. In goat farms, for the prediction of protein content, k-nearest neighbours was selected (MAPE: 6.17%); for the prediction of somatic cell counts, random forests and k-nearest neighbours were selected (4.93% and 5.00%); and for the prediction of total bacterial counts, neural networks were selected (8.33%); no tool provided useful prediction models for fat content or for fat and protein content combined. The results of the study will be of interest to farmers and to professionals, and the findings will also be useful to dairy processing factories. With sufficient data attributes, the models make possible a distance-aware, rapid, quantitative estimation of the milk output from sheep and goat farms. It will thus become easier to monitor and improve milk quality at the farm level as part of the dairy production chain. Moreover, the findings can support the setup of relevant and appropriate measures and interventions in dairy sheep and goat farms.
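
The abstract does not state the MAPE formula or explain how the 9220 evaluations are counted. The following is a minimal Python sketch of the selection rule it describes, using the standard definition of MAPE (the mean of |measured - predicted| / |measured|, expressed in percent). The function names, the measured values, and the predictions are hypothetical and only illustrate the mechanics. The final lines check one plausible reading of the total: the 922 listed model configurations run for each of the 5 targets in each of the 2 species. That reading is an assumption, not something the abstract confirms.

```python
from typing import Dict, List, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (standard definition)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the measured value;
    # measured values of zero are not supported by this definition.
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def select_tools(mape_by_tool: Dict[str, float]) -> List[str]:
    """Return the tool(s) with the lowest MAPE; ties are all returned."""
    best = min(mape_by_tool.values())
    return sorted(tool for tool, score in mape_by_tool.items() if score == best)


# Hypothetical measured vs. predicted protein content (%), for illustration only.
measured = [5.0, 6.0, 5.5, 6.2]
predictions = {
    "decision tree": [5.5, 6.6, 5.0, 6.8],
    "k-nearest neighbours": [5.1, 5.9, 5.6, 6.1],
}
scores = {tool: mape(measured, pred) for tool, pred in predictions.items()}
selected = select_tools(scores)

# Model configurations per tool, as listed in the abstract.
configs = {
    "decision trees": 18,
    "random forests": 16,
    "XGBoost": 240,
    "k-nearest neighbours": 72,
    "neural networks": 576,
}
per_target = sum(configs.values())  # 922
# Assumption: every configuration was evaluated for 5 targets in 2 species.
total_evaluations = per_target * 5 * 2  # 9220
```

Returning every tool that ties for the lowest error mirrors the abstract's report of two tools being selected together for somatic cell counts in sheep farms.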

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/000d/11726918/62f506211c9a/foods-13-04015-g001.jpg
