Hamelin D J, Scicluna M, Saadie I, Mostefai F, Grenier J C, Baron C, Caron E, Hussin J G
Montreal Heart Institute, Université de Montréal, Montréal, Quebec, Canada.
Mila - Quebec AI Institute, Montréal, Quebec, Canada.
Comput Struct Biotechnol J. 2025 Mar 28;27:1370-1382. doi: 10.1016/j.csbj.2025.03.044. eCollection 2025.
The genomic diversification of viral pathogens during epidemics and pandemics is a major adaptive route by which infectious agents circumvent therapeutic and public health initiatives. Historically, strategies to address viral evolution have relied on responding to emerging variants after their detection, which delays effective public health responses. A long-standing yet challenging objective has therefore been to forecast viral evolution by predicting potentially harmful viral mutations before they emerge. The promise of artificial intelligence (AI), coupled with the exponential growth of viral data collection infrastructures spurred by the COVID-19 pandemic, has produced a research ecosystem highly conducive to this objective. Because the COVID-19 pandemic accelerated the development of pandemic mitigation and preparedness strategies, many of the methods discussed here were designed in the context of SARS-CoV-2 evolution. However, most of these pipelines were intentionally designed to be adaptable across RNA viruses, and several strategies have already been applied to multiple viral species. In this review, we explore recent breakthroughs that have facilitated the forecasting of viral evolution in the context of an ongoing pandemic, with particular emphasis on deep learning architectures, including the potential of language models (LMs). The approaches discussed here leverage genomic, epidemiologic, immunologic and biological information.