Walden Emilee, Chen Jiahui, Wei Guo-Wei
Department of Mathematical Sciences, University of Arkansas, Fayetteville, AR 72701, USA.
Department of Mathematics, Michigan State University, MI 48824, USA.
ArXiv. 2025 Apr 4:arXiv:2504.03550v1.
Viral mutations pose significant threats to public health by increasing infectivity, strengthening vaccine resistance, and altering disease severity. To track these evolving patterns, agencies like the CDC annually evaluate thousands of virus strains, underscoring the urgent need to understand viral mutagenesis and evolution in depth. In this study, we integrate genomic analysis, clustering, and three leading dimensionality reduction approaches, namely principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP), to investigate the effects of COVID-19 on influenza virus propagation. By applying these methods to extensive pre- and post-pandemic influenza datasets, we reveal how selective pressures during the pandemic have influenced the genetic diversity of influenza. Our findings indicate that combining robust dimensionality reduction with clustering yields critical insights into the complex dynamics of viral mutation, informing both future research directions and strategies for public health intervention.