

A comparison of the value of two machine learning predictive models to support bovine tuberculosis disease control in England.

Author information

Animal and Plant Health Agency, Woodham Lane, Addlestone, Surrey, KT15 3NB, United Kingdom; Royal Veterinary College, Hawkshead Lane, North Mymms, Hatfield, Hertfordshire, AL9 7TA, United Kingdom.

Royal Veterinary College, Hawkshead Lane, North Mymms, Hatfield, Hertfordshire, AL9 7TA, United Kingdom.

Publication information

Prev Vet Med. 2021 Mar;188:105264. doi: 10.1016/j.prevetmed.2021.105264. Epub 2021 Jan 15.

Abstract

Nearly a decade into Defra's current eradication strategy, bovine tuberculosis (bTB) remains a serious animal health problem in England, with c. 30,000 cattle slaughtered annually in the fight against this insidious disease. There is an urgent need to improve our understanding of bTB risk in order to enhance the current disease control policy. Machine learning approaches applied to big datasets offer a potential way to do this. Regularized regression and random forest machine learning methodologies were implemented using 2016 herd-level data to generate the best possible predictive models for a bTB incident in England and its three surveillance risk areas (High-risk area [HRA], Edge area [EA] and Low-risk area [LRA]). Their predictive performance was compared and the best models in each area were used to characterize herds according to risk. While all models provided excellent discrimination, random forest models achieved the highest balanced accuracy (i.e. the average of sensitivity and specificity) in England, the HRA and the LRA, whereas the regularized regression LASSO model did so in the EA. The time since the last confirmed incident was resolved was the only variable ranked in the top ten in all areas by both types of models, which highlights the importance of bTB history as a predictor of a new incident. Risk categorisation based on Receiver Operating Characteristic (ROC) analysis was carried out using the best predictive model in each area, setting a 99 % threshold value for sensitivity and specificity (97 % in the LRA). Thirteen percent of herds in the whole of England as well as in its HRA, 14 % in its EA and 31 % in its LRA were classified as high-risk. These could be selected for the deployment of additional disease control measures at national or area level. In this way, low-risk herds within the area considered would not be penalised unnecessarily by blanket control measures and limited resources would be used more efficiently.
The methodology presented in this paper demonstrates a way to accurately identify high-risk farms to inform a targeted disease control and prevention strategy in England that supplements existing population strategies.
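
The two quantities the abstract relies on, balanced accuracy and a score cut-off chosen to reach a target sensitivity, can be illustrated with a minimal Python sketch. The function names, the toy labels and the toy scores below are all hypothetical and are not taken from the paper; the abstract does not state how the cut-offs were mapped to risk categories, so the cut-off helper shows only one plausible reading of "setting a 99 % threshold value for sensitivity".

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity); label 1 = herd had a bTB incident.

    Assumes both classes are present in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Average of sensitivity and specificity, as defined in the abstract."""
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return (sens + spec) / 2


def cutoff_for_sensitivity(y_true: Sequence[int], scores: Sequence[float], target: float) -> float:
    """Highest cut-off whose sensitivity reaches `target` (hypothetical helper).

    A herd is predicted positive when its score is >= the cut-off.
    """
    for cutoff in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= cutoff else 0 for s in scores]
        sens, _ = sensitivity_specificity(y_true, y_pred)
        if sens >= target:
            return cutoff
    raise ValueError("target sensitivity cannot be reached")


# Toy data (invented for illustration): 3 incident herds, 5 non-incident herds.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]

print(balanced_accuracy(toy_labels, toy_pred))
print(cutoff_for_sensitivity(toy_labels, toy_scores, 0.99))
```

Balanced accuracy is a natural comparison metric here because herds with an incident are a minority, so plain accuracy would reward a model that predicts "no incident" for every herd.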

