A hybrid computational framework for intelligent inter-continent SARS-CoV-2 sub-strains characterization and prediction.

Affiliations

Department of Computer Science, University of Uyo, P.M.B. 1017, Uyo, 520003, Nigeria.

Centre for Research and Development, University of Uyo, P.M.B. 1017, Uyo, 520003, Nigeria.

Publication information

Sci Rep. 2021 Jul 15;11(1):14558. doi: 10.1038/s41598-021-93757-w.

DOI: 10.1038/s41598-021-93757-w

PMID: 34267263

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8282786/

Abstract

Whereas accelerated attention beclouded the early stages of the coronavirus spread, knowledge of the actual pathogenicity and origin of possible sub-strains remained unclear. A total of 8864 complete human SARS-CoV-2 genome sequences, processed by gender and spanning 6 continents (88 countries; Antarctica excluded), were harvested from the Global Initiative on Sharing All Influenza Data (GISAID) database (https://www.gisaid.org/) for the period between December 2019 and January 15, 2021, and analyzed. We hypothesized that the data speak for themselves and can reveal true and explainable patterns of the disease. Identical genome diversity and pattern correlates analysis, performed using a hybrid of biotechnology and machine learning methods, corroborates the emergence of inter- and intra- SARS-CoV-2 sub-strains transmission and sustains an increase in sub-strains within the various continents, with nucleotide mutations varying dynamically between individuals in close association with the virus as it adapts to its host/environment. Interestingly, some viral sub-strain patterns progressively transformed into new sub-strain clusters, indicating varying amino acid and strong nucleotide association derived from the same lineage. A novel cognitive approach to knowledge mining helped the discovery of transmission routes and a seamless contact tracing protocol. Our classification results were better than those of state-of-the-art methods, indicating a more robust system for predicting emerging or new viral sub-strain(s). The results therefore offer explanations for the growing concerns about the virus and its next wave(s). A future direction of this work is a defuzzification of confusable pattern clusters for precise intra-country SARS-CoV-2 sub-strains analytics.
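The abstract does not describe the feature extraction or the learning pipeline, so the following is only a minimal Python sketch, under stated assumptions, of two generic operations that genome-mining studies of this kind often build on: listing nucleotide differences between two aligned sequences, and comparing sequences through k-mer frequency profiles. The function names, the toy sequences, and the choice of dinucleotide (k = 2) profiles with cosine similarity are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import product


def point_mutations(ref, sample):
    """List (1-based position, ref base, sample base) where two
    equal-length, already aligned sequences differ."""
    if len(ref) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + 1, r, s)
            for i, (r, s) in enumerate(zip(ref.upper(), sample.upper()))
            if r != s]


def kmer_profile(seq, k=2):
    """Normalized k-mer frequencies over the ACGT alphabet.
    k-mers containing other symbols (for example N) are ignored."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy sequences (hypothetical, far shorter than a real ~30 kb genome).
reference = "ATGGCGTACGTTAGC"
isolate = "ATGGCGTACGATAGC"

print(point_mutations(reference, isolate))
print(cosine_similarity(kmer_profile(reference), kmer_profile(isolate)))
```

In a real analysis, profiles like these would be computed for every genome and passed to a clustering or classification model; which model the authors used is not stated in the abstract.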

Similar articles

Sequencing Using a Two-Step Strategy Reveals High Genetic Diversity in the S Gene of SARS-CoV-2 after a High-Transmission Period in Tunis, Tunisia.

Microbiol Spectr. 2021 Dec 22;9(3):e0063921. doi: 10.1128/Spectrum.00639-21. Epub 2021 Nov 10.

Emergence of novel SARS-CoV-2 variants in the Netherlands.

Sci Rep. 2021 Mar 23;11(1):6625. doi: 10.1038/s41598-021-85363-7.

Collaborative Mining of Whole Genome Sequences for Intelligent HIV-1 Sub-Strain(s) Discovery.

Curr HIV Res. 2022 Aug 12;20(2):163-183. doi: 10.2174/1570162X20666220210142209.

Unsupervised cluster analysis of SARS-CoV-2 genomes reflects its geographic progression and identifies distinct genetic subgroups of SARS-CoV-2 virus.

Genet Epidemiol. 2021 Apr;45(3):316-323. doi: 10.1002/gepi.22373. Epub 2021 Jan 8.

SARS-CoV-2 lineage B.6 was the major contributor to early pandemic transmission in Malaysia.

PLoS Negl Trop Dis. 2020 Nov 30;14(11):e0008744. doi: 10.1371/journal.pntd.0008744. eCollection 2020 Nov.

Assessment and classification of COVID-19 DNA sequence using pairwise features concatenation from multi-transformer and deep features with machine learning models.

SLAS Technol. 2024 Aug;29(4):100147. doi: 10.1016/j.slast.2024.100147. Epub 2024 May 23.

SARS Coronavirus-2 variant tracing within the first Coronavirus Disease 19 clusters in northern Germany.

Clin Microbiol Infect. 2021 Jan;27(1):130.e5-130.e8. doi: 10.1016/j.cmi.2020.09.034. Epub 2020 Sep 29.

Topological Analysis for Sequence Variability: Case Study on more than 2K SARS-CoV-2 sequences of COVID-19 infected 54 countries in comparison with SARS-CoV-1 and MERS-CoV.

Infect Genet Evol. 2021 Mar;88:104708. doi: 10.1016/j.meegid.2021.104708. Epub 2021 Jan 6.

Haplotype distribution of SARS-CoV-2 variants in low and high vaccination rate countries during ongoing global COVID-19 pandemic in early 2021.

Infect Genet Evol. 2022 Jan;97:105164. doi: 10.1016/j.meegid.2021.105164. Epub 2021 Nov 27.

Cited by

Utilizing genomic signatures to gain insights into the dynamics of SARS-CoV-2 through Machine and Deep Learning techniques.

BMC Bioinformatics. 2024 Mar 27;25(1):131. doi: 10.1186/s12859-024-05648-2.

A One Health strategy for emerging infectious diseases based on the COVID-19 outbreak.

J Biosaf Biosecur. 2022 Jun;4(1):5-11. doi: 10.1016/j.jobb.2021.09.003. Epub 2021 Oct 28.
