Machine Learning Early Detection of SARS-CoV-2 High-Risk Variants.

Authors

Li Lun, Li Cuiping, Li Na, Zou Dong, Zhao Wenming, Luo Hong, Xue Yongbiao, Zhang Zhang, Bao Yiming, Song Shuhui

Affiliations

China National Center for Bioinformation, Beijing, 100101, China.

National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.

Publication

Adv Sci (Weinh). 2024 Dec;11(45):e2405058. doi: 10.1002/advs.202405058. Epub 2024 Oct 14.

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has evolved many high-risk variants, resulting in repeated COVID-19 waves over the past years. Accurate early warning of high-risk variants is therefore vital for epidemic prevention and control. However, detecting high-risk variants through experimental and epidemiological research is time-consuming and often lags behind the emergence and spread of these variants. In this study, HiRisk-Detector, a machine learning algorithm based on a haplotype network, is developed for early computational detection of high-risk SARS-CoV-2 variants. Its effectiveness, robustness, and generalizability are validated using over 7.6 million high-quality, complete SARS-CoV-2 genomes and their metadata. First, HiRisk-Detector is evaluated on empirical data, where it detects all 13 high-risk variants, preceding World Health Organization announcements by 27 days on average. Second, its robustness is tested by reducing sequencing intensity to one-fourth, which delays detection by only 3.8 days. Third, HiRisk-Detector is applied to detect risk among sub-lineages of the SARS-CoV-2 Omicron variant, confirming its broad applicability with high ROC-AUC and PR-AUC performance. Overall, HiRisk-Detector shows a strong capacity for early detection of high-risk variants and is of great utility for public health emergencies caused by infectious diseases or viruses.
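
The two kinds of evaluation figures quoted above (average lead time over official announcements, and ROC-AUC for risk ranking) can be illustrated with a minimal sketch. This is not the HiRisk-Detector implementation; the function names, variant names, dates, labels, and scores below are all hypothetical toy values chosen only to show how such metrics are typically computed.

```python
from datetime import date

def mean_lead_days(detected, announced):
    """Average number of days by which detection precedes announcement."""
    leads = [(announced[v] - detected[v]).days for v in detected]
    return sum(leads) / len(leads)

def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data, NOT taken from the paper.
detected = {"variant_A": date(2021, 1, 1), "variant_B": date(2021, 6, 10)}
announced = {"variant_A": date(2021, 1, 21), "variant_B": date(2021, 7, 10)}
print(mean_lead_days(detected, announced))  # 25.0 (leads of 20 and 30 days)

# Hypothetical labels (1 = high-risk) and model scores for five lineages.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
print(roc_auc(labels, scores))  # 5 of 6 positive/negative pairs ranked correctly
```

The pairwise form of ROC-AUC is used here because it needs no external libraries; for real datasets a library routine would usually be preferable.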

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbf8/11615786/f6f5e5313244/ADVS-11-2405058-g003.jpg
