Using artificial intelligence techniques for COVID-19 genome analysis.

Author information

Nawaz M Saqib, Fournier-Viger Philippe, Shojaee Abbas, Fujita Hamido

Affiliations

School of Humanities and Social Sciences, Harbin Institute of Technology (Shenzhen), Shenzhen, China.

Yale University School of Medicine, New Haven, USA.

Publication information

Appl Intell (Dordr). 2021;51(5):3086-3103. doi: 10.1007/s10489-021-02193-w. Epub 2021 Feb 17.

Abstract

The genome of the novel coronavirus that causes COVID-19 was first sequenced in January 2020, approximately a month after the disease emerged in Wuhan, the capital of Hubei province, China. COVID-19 genome sequencing is critical to understanding the virus's behavior, its origin, and how fast it mutates, and to developing drugs, vaccines, and effective preventive strategies. This paper investigates the use of artificial intelligence techniques to extract useful information from COVID-19 genome sequences. First, sequential pattern mining (SPM) is applied to a machine-readable corpus of COVID-19 genome sequences to determine whether hidden patterns can be found that reveal frequent patterns of nucleotide bases and the relationships among them. Second, sequence prediction models are applied to the corpus to evaluate whether nucleotide bases can be predicted from the bases that precede them. Third, for mutation analysis, an algorithm is designed to find the locations in the genome sequences where nucleotide bases have changed and to calculate the mutation rate. The results suggest that SPM and mutation analysis can reveal useful information and patterns in COVID-19 genome sequences, supporting the study of the evolution of COVID-19 strains and of the variation among them, respectively.
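The abstract does not spell out the algorithms, so the following Python sketch is only an illustration of the first and third steps under stated assumptions, not the authors' method. It assumes the sequences are already aligned and of equal length, defines the mutation rate as the fraction of positions that differ from a reference, and uses contiguous k-mer counting as a simplified stand-in for sequential pattern mining. The function names `find_mutations`, `mutation_rate`, and `frequent_kmers` are hypothetical.

```python
from collections import Counter


def find_mutations(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumption: both sequences are pre-aligned and have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def mutation_rate(reference: str, sample: str) -> float:
    """Fraction of positions that differ (an assumed definition of the rate)."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    return len(find_mutations(reference, sample)) / len(reference)


def frequent_kmers(sequences: list[str], k: int, min_support: int) -> dict[str, int]:
    """Contiguous k-mers that occur in at least `min_support` sequences.

    This is a simplified stand-in for sequential pattern mining: support is
    the number of sequences containing the k-mer, counted once per sequence.
    """
    support: Counter[str] = Counter()
    for seq in sequences:
        # A set ensures each k-mer is counted at most once per sequence.
        support.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return {kmer: count for kmer, count in support.items() if count >= min_support}
```

For example, comparing the toy sequences `"ATGCATGC"` and `"ATGAATGT"` reports changes at positions 4 and 8, giving a rate of 2/8. Real genomes would first need a proper alignment step to handle insertions and deletions, which this sketch omits.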
