Evaluating the performance of herd-specific long short-term memory models to identify automated health alerts associated with a ketosis diagnosis in early-lactation cows.

Author information

Taechachokevivat N, Kou B, Zhang T, Montes M E, Boerman J P, Doucette J S, Neves R C

Affiliations

Department of Veterinary Clinical Sciences, Purdue University, West Lafayette, IN 47907.

Department of Computer Science, Purdue University, West Lafayette, IN 47907.

Publication information

J Dairy Sci. 2024 Dec;107(12):11489-11501. doi: 10.3168/jds.2023-24513. Epub 2024 Sep 7.

Abstract

The growing use of automated systems in the dairy industry generates a vast amount of cow-level data daily, creating opportunities for using these data to support real-time decision-making. Currently, various commercial systems offer built-in alert algorithms to identify cows requiring attention. To our knowledge, no work has compared the predictive ability of models that account for herd-level variability against that of automated systems. Long short-term memory (LSTM) models are machine learning models capable of learning temporal patterns and making predictions based on time series data. The objective of our study was to evaluate the ability of LSTM models to identify a health alert associated with a ketosis diagnosis (HAK) using deviations of daily milk yield, milk fat-to-protein ratio (FPR), number of successful milkings, rumination time, and activity index from the herd median by parity and DIM, considering various time series lengths and numbers of days before HAK. Additionally, we aimed to use an explainable artificial intelligence method to understand the relationships between input variables and model outputs. Data on daily milk yield, milk FPR, number of successful milkings, rumination time, activity, and health events during 0 to 21 DIM were retrospectively obtained from a commercial Holstein dairy farm in northern Indiana from February 2020 to January 2023. A total of 1,743 cows were included in the analysis (non-HAK = 1,550; HAK = 193). Variables were transformed based on deviations from the herd median by parity and DIM. Six LSTM models were developed to identify HAK 1, 2, and 3 d before farm diagnosis using historic cow-level data with varying time series lengths. Model performance was assessed using repeated stratified 10-fold cross-validation for 20 repeats. The Shapley additive explanations framework (SHAP) was used for model explanation.
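The transformation described above, expressing each variable as a deviation from the herd median by parity and DIM, can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the field names (`parity`, `dim`, `milk_kg`), and the toy values are hypothetical, and the abstract does not say how parity groups were defined.

```python
from collections import defaultdict
from statistics import median


def deviations_from_herd_median(records, variable):
    """Return each record's value minus the median of its (parity, DIM) group."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["parity"], r["dim"])].append(r[variable])
    medians = {key: median(values) for key, values in groups.items()}
    return [r[variable] - medians[(r["parity"], r["dim"])] for r in records]


# Hypothetical toy herd: one daily record per cow, all at 5 DIM.
records = [
    {"cow": "A", "parity": 1, "dim": 5, "milk_kg": 20.0},
    {"cow": "B", "parity": 1, "dim": 5, "milk_kg": 24.0},
    {"cow": "C", "parity": 1, "dim": 5, "milk_kg": 31.0},
    {"cow": "D", "parity": 2, "dim": 5, "milk_kg": 30.0},
    {"cow": "E", "parity": 2, "dim": 5, "milk_kg": 38.0},
]

devs = deviations_from_herd_median(records, "milk_kg")
print(devs)  # [-4.0, 0.0, 7.0, -4.0, 4.0]
```

A negative value means the cow produced less than the median herdmate of the same parity on the same day in milk, which is the kind of signal the model explanation associates with HAK.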
Model accuracy was 83%, 74%, and 70%; balanced error rate was 17% to 18%, 26% to 28%, and 34%; sensitivity was 81% to 83%, 71% to 74%, and 62%; specificity was 83%, 74%, and 71%; positive predictive value was 38%, 25% to 27%, and 21%; negative predictive value was 97% to 98%, 95% to 96%, and 94%; and area under the curve was 0.89 to 0.90, 0.80 to 0.81, and 0.72 for models identifying HAK 1, 2, and 3 d before diagnosis, respectively. Performance declined as the time interval between identification and farm diagnosis increased, and extending the time series length did not improve model performance. Model explanation revealed that cows with lower milk yield, number of successful milkings, rumination time, and activity, and higher milk FPR compared with herdmates of the same parity and DIM were more likely to be classified as HAK. Our results demonstrate the potential of LSTM models in identifying HAK using deviations of daily milk production variables, rumination time, and activity index from the herd median by parity and DIM. Future studies are needed to evaluate the performance of health alerts using LSTM models controlling for herd-specific metrics against commercial built-in algorithms in multiple farms and for other disorders.
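The gap between the reported sensitivity and the positive predictive value follows from the low share of HAK cows in the data set (193 of 1,743). The sketch below works through that arithmetic for the 1-d models. It is an illustration only: the function name is hypothetical, the sensitivity of 0.82 is a midpoint of the reported 81% to 83% range chosen for the example, and it assumes the evaluation folds keep the overall HAK proportion, which the abstract does not state explicitly.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Derive PPV, NPV, and balanced error rate from sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence              # true positive share
    fn = (1 - sensitivity) * prevalence        # false negative share
    tn = specificity * (1 - prevalence)        # true negative share
    fp = (1 - specificity) * (1 - prevalence)  # false positive share
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "balanced_error_rate": 1 - (sensitivity + specificity) / 2,
    }


prevalence = 193 / 1743  # HAK cows over all cows, from the abstract
result = predictive_values(sensitivity=0.82, specificity=0.83, prevalence=prevalence)

for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the sketch gives a positive predictive value near 0.38, a negative predictive value near 0.97, and a balanced error rate of 0.175, in line with the figures reported for the 1-d models. With roughly 11% of cows being HAK, even a specificity of 83% yields more false alerts than true alerts.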
