Predicting host species susceptibility to influenza viruses and coronaviruses using genome data and machine learning: a scoping review.

Authors

Alberts Famke, Berke Olaf, Rocha Leilani, Keay Sheila, Maboni Grazieli, Poljak Zvonimir

Affiliations

Department of Population Medicine, Ontario Veterinary College, University of Guelph, Guelph, ON, Canada.

Centre for Public Health and Zoonoses, University of Guelph, Guelph, ON, Canada.

Publication information

Front Vet Sci. 2024 Sep 25;11:1358028. doi: 10.3389/fvets.2024.1358028. eCollection 2024.

Abstract

INTRODUCTION

Predicting which species are susceptible to viruses (i.e., host range) is important for understanding viral outbreaks in both humans and animals and for developing effective strategies to control them. The use of machine learning and bioinformatic approaches to predict viral hosts has expanded as techniques have advanced. We conducted a scoping review to identify the breadth of machine learning methods applied to influenza and coronavirus genome data for the identification of susceptible host species.

METHODS

The protocol for this scoping review is available at https://hdl.handle.net/10214/26112. Five online databases were searched, yielding 1,217 citations published between January 2000 and May 2022. The citations were screened in duplicate to identify English-language research covering the use of machine learning to identify species susceptible to viruses.

RESULTS

Fifty-three relevant publications were identified for data charting. The breadth of research was extensive: 32 different machine learning algorithms were used in combination with 29 different feature selection methods and 43 different genome data input formats, and authors used 20 different methods to assess accuracy. Authors mostly used influenza viruses (n = 31/53 publications, 58.5%); however, more recent publications focused on coronaviruses and other viruses in combination with influenza viruses (n = 22/53, 41.5%). The susceptible animal groups most often used were humans (n = 57/77 analyses, 74.0%), avian species (n = 35/77, 45.4%), and swine (n = 28/77, 36.4%). In total, 53 different hosts were used, and most publications used data from multiple hosts.

DISCUSSION

The main gaps in research were a lack of standardized reporting of methodology and the use of broad host categories for classification. Overall, approaches to viral host identification using machine learning were diverse and extensive.
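The abstract does not describe any single pipeline, so the following is only a minimal sketch of the general pattern the review charts: a genome data input is converted to features, and a machine learning algorithm maps those features to a host category. Every specific here is an assumption made for illustration and is not taken from the reviewed publications: the choice of k-mer frequencies as features, the nearest-centroid classifier, the function names, and the toy sequences, which are invented strings and not real viral genomes.

```python
from collections import Counter
from itertools import product
import math

K = 2  # k-mer length; an illustrative choice, not taken from the review
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_frequencies(seq, k=K):
    """Return the relative frequency of each k-mer in a nucleotide sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # k-mers containing ambiguous bases (e.g. N) are ignored.
    total = sum(counts[km] for km in KMERS)
    return [counts[km] / total if total else 0.0 for km in KMERS]


def centroid(vectors):
    """Average a list of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def train(labelled_sequences):
    """Build one centroid per host label from (sequence, host) pairs."""
    by_host = {}
    for seq, host in labelled_sequences:
        by_host.setdefault(host, []).append(kmer_frequencies(seq))
    return {host: centroid(vecs) for host, vecs in by_host.items()}


def predict(model, seq):
    """Assign the host whose centroid is closest in Euclidean distance."""
    features = kmer_frequencies(seq)
    return min(model, key=lambda host: math.dist(features, model[host]))


# Hypothetical toy data: invented strings with broad host labels.
toy_training = [
    ("ATATATATATAT", "avian"),
    ("ATATTATATAAT", "avian"),
    ("GCGCGCGCGCGC", "human"),
    ("GCGGCGCGCCGC", "human"),
]
model = train(toy_training)
```

Calling `predict(model, some_sequence)` returns one of the broad labels present in the training data. The sketch therefore shares the limitation noted in the Discussion: a classifier trained on broad host categories such as "avian" cannot distinguish individual species within them.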
