COVID-DeepPredictor: Recurrent Neural Network to Predict SARS-CoV-2 and Other Pathogenic Viruses.

Author information

Saha Indrajit, Ghosh Nimisha, Maity Debasree, Seal Arjit, Plewczynski Dariusz

Affiliations

Department of Computer Science and Engineering, National Institute of Technical Teachers' Training and Research, Kolkata, India.

Department of Computer Science and Information Technology, Institute of Technical Education and Research, Siksha 'O' Anusandhan (Deemed to Be University), Bhubaneswar, India.

Publication information

Front Genet. 2021 Feb 11;12:569120. doi: 10.3389/fgene.2021.569120. eCollection 2021.

Abstract

COVID-19, the disease caused by the novel coronavirus SARS-CoV-2, has become a global pandemic. The high transmission rate of this pathogenic virus demands early prediction and proper identification so that treatment can follow. However, the polymorphic nature of the virus allows it to adapt and persist in different kinds of environments, which makes it difficult to predict. In addition, other pathogens such as SARS-CoV-1, MERS-CoV, Ebola, Dengue, and Influenza also circulate, so a predictor that can distinguish among them using their genomic information is needed. To address this problem, this work proposes COVID-DeepPredictor, a deep learning framework for identifying an unknown sequence of these pathogens. COVID-DeepPredictor uses Long Short-Term Memory (LSTM), a type of Recurrent Neural Network, for the underlying prediction together with an alignment-free technique. Specifically, the k-mer technique is applied to create Bag-of-Descriptors (BoDs), from which Bag-of-Unique-Descriptors (BoUDs) are generated as the vocabulary, and an embedded representation is then prepared for the given virus sequences. The predictor is validated not only on the dataset using k-fold cross-validation but also on unseen test datasets of SARS-CoV-2 sequences and sequences from other viruses. To verify its efficacy, COVID-DeepPredictor is compared with other state-of-the-art prediction techniques based on Linear Discriminant Analysis, Random Forests, and the Gradient Boosting Method. COVID-DeepPredictor achieves 100% prediction accuracy on the validation dataset, while on the test datasets the accuracy ranges from 99.51% to 99.94%, and it outperforms the other prediction techniques. In addition, the accuracy and runtime of COVID-DeepPredictor are considered together to determine the value of k in k-mer; a comparative study among values of k in k-mer, Bag-of-Descriptors (BoDs), and Bag-of-Unique-Descriptors (BoUDs), and a comparison between COVID-DeepPredictor and Nucleotide BLAST, have also been performed. The code, training, and test datasets used for COVID-DeepPredictor have been made available by the authors.
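
The alignment-free preprocessing described in the abstract (k-mers collected into a bag of descriptors, reduced to unique descriptors that serve as a vocabulary, then turned into an integer representation suitable for an embedding layer) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `kmers`, `build_vocabulary`, and `encode`, the toy sequences, the choice k = 3, and the use of index 0 for padding and unseen k-mers are all assumptions made for illustration. The embedding layer and the LSTM that would consume these indices are not shown.

```python
from collections import Counter


def kmers(sequence, k):
    """Slide a window of width k over the sequence, one base at a time."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_vocabulary(sequences, k):
    """Collect all k-mers (bag of descriptors), then index the unique ones.

    Index 0 is reserved for padding and unseen k-mers; this convention is
    an assumption of the sketch, not something stated in the abstract.
    """
    bag = Counter()
    for seq in sequences:
        bag.update(kmers(seq, k))
    unique = sorted(bag)  # bag of unique descriptors, in a fixed order
    return {kmer: idx for idx, kmer in enumerate(unique, start=1)}


def encode(sequence, vocabulary, k, length):
    """Map a sequence to a fixed-length list of vocabulary indices."""
    ids = [vocabulary.get(kmer, 0) for kmer in kmers(sequence, k)]
    ids = ids[:length]                       # truncate long sequences
    return ids + [0] * (length - len(ids))   # pad short sequences with 0


# Toy sequences for illustration only, not real viral genomes.
training = ["ATGCGATA", "ATGCCATA"]
vocab = build_vocabulary(training, k=3)
encoded = [encode(seq, vocab, k=3, length=8) for seq in training]
```

Each row of `encoded` has the same length, so the rows can be stacked into a matrix and passed to an embedding layer followed by a recurrent layer. Larger values of k enlarge the vocabulary and the preprocessing cost, which is consistent with the abstract's note that accuracy and runtime were weighed together when choosing k.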

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/72aa/7906283/0aea84a365e7/fgene-12-569120-g0001.jpg
