Nguyen Hoang-Hai, Rudar Josip, Lesperance Nathaniel, Vernygora Oksana, Taylor Graham W, Laing Chad, Lapen David, Leung Carson K, Lung Oliver
National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba R3E 3M4, Canada.
Department of Computer Science, University of Manitoba, Winnipeg, Manitoba R3T 2N2, Canada.
Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf089.
Influenza A virus (IAV) poses a significant threat to animal health globally because of its ability to overcome species barriers and cause pandemics. Rapid and accurate prediction of IAV subtype and host source is crucial for effective surveillance and pandemic preparedness. Deep learning has emerged as a powerful tool for analyzing viral genomic sequences, offering new ways to uncover hidden patterns associated with viral characteristics and host adaptation.
We introduce WaveSeekerNet, a novel deep learning model for accurate and rapid prediction of IAV subtype and host source. The model combines attention-based mechanisms with efficient token mixing schemes, including the Fourier Transform and the Wavelet Transform, to capture intricate patterns within viral RNA and protein sequences. Extensive experiments on diverse datasets show that WaveSeekerNet outperforms existing models that use the traditional self-attention mechanism. Notably, WaveSeekerNet rivals VADR (Viral Annotation DefineR) in subtype prediction on high-quality RNA sequences, achieving the maximum score of 1.0 on metrics including Balanced Accuracy, F1-score (Macro Average), and Matthews Correlation Coefficient. Our approach to subtype and host source prediction also exceeds the pretrained ESM-2 (Evolutionary Scale Modeling) models in both generalization performance and computational cost. Furthermore, WaveSeekerNet distinguishes between human, avian, and other mammalian hosts with high accuracy. The ability of WaveSeekerNet to flag potential cross-species transmission events underscores its value for real-time surveillance and proactive pandemic preparedness efforts.
WaveSeekerNet's superior performance, efficiency, and ability to flag potential cross-species transmission events highlight its potential for real-time surveillance and pandemic preparedness. This model represents a significant advancement in applying deep learning to IAV classification and holds promise for future epidemiological and veterinary studies and for public health interventions.
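For readers unfamiliar with the three metrics named in the abstract, the following is a minimal sketch of how Balanced Accuracy, macro-averaged F1, and the multiclass Matthews Correlation Coefficient can be computed from label lists. It is not the authors' evaluation code: the function names and the toy subtype labels are assumptions made for illustration, and the MCC uses the standard multiclass generalization.

```python
from collections import Counter
from math import sqrt


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes count as much as common ones."""
    recalls = []
    for c in sorted(set(y_true)):
        support = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def matthews_corrcoef(y_true, y_pred):
    """Multiclass MCC computed from class counts; 0.0 when undefined."""
    n = len(y_true)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    classes = set(true_counts) | set(pred_counts)
    cov_tp = correct * n - sum(true_counts[k] * pred_counts[k] for k in classes)
    cov_tt = n * n - sum(v * v for v in true_counts.values())
    cov_pp = n * n - sum(v * v for v in pred_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    return cov_tp / denom if denom else 0.0


# Toy labels (hypothetical, not data from the paper).
y_true = ["H1N1", "H3N2", "H5N1", "H3N2"]
perfect = list(y_true)
one_miss = ["H1N1", "H3N2", "H3N2", "H3N2"]  # the H5N1 sample is misclassified

for name, y_pred in [("perfect", perfect), ("one_miss", one_miss)]:
    print(name,
          round(balanced_accuracy(y_true, y_pred), 3),
          round(macro_f1(y_true, y_pred), 3),
          round(matthews_corrcoef(y_true, y_pred), 3))
```

All three metrics reach 1.0 only when every sample is classified correctly. A single error on a rare class lowers Balanced Accuracy and macro F1 more than it would lower plain accuracy, which is why these metrics suit imbalanced subtype and host datasets.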