Laboratory of Machine Learning and Intelligent Instrumentation, Federal University of Rio Grande do Norte, Natal, RN, 59078-970, Brazil.
Department of Pharmacy and Pharmaceutical Technology, University of Granada, Granada, Spain.
BMC Bioinformatics. 2023 Mar 11;24(1):92. doi: 10.1186/s12859-023-05188-1.
In December 2019, the first case of COVID-19 was described in Wuhan, China, and by July 2022, there were already 540 million confirmed cases. Due to the rapid spread of the virus, the scientific community has made efforts to develop techniques for the viral classification of SARS-CoV-2.
In this context, we propose a new gene sequence representation based on Genomic Signal Processing techniques. First, we applied the mapping approach to samples of six viral species of the Coronaviridae family, to which SARS-CoV-2 belongs. We then used the downsized sequences obtained by the proposed method as input to a deep learning architecture for viral classification, achieving accuracies of 98.35%, 99.08%, and 99.69% for viral signatures of size 64, 128, and 256, respectively, and a precision of 99.95% for the vectors of size 256.
Compared with the results produced using other state-of-the-art representation techniques, the classification results demonstrate that the proposed mapping can provide satisfactory performance with low memory and processing time costs.
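The abstract does not spell out the mapping itself, so the following is only a minimal sketch of the general idea of turning a nucleotide sequence into a fixed-size numeric signature. The numeric values in `NUCLEOTIDE_VALUES`, the window-averaging step, and the function names `sequence_to_signal`, `downsize_signal`, and `viral_signature` are all assumptions made for illustration and are not the representation proposed in the paper.

```python
from typing import Dict, List

# Hypothetical nucleotide-to-number mapping (an assumption, not the paper's mapping).
NUCLEOTIDE_VALUES: Dict[str, float] = {"A": 0.25, "C": 0.5, "G": 0.75, "T": 1.0}


def sequence_to_signal(sequence: str) -> List[float]:
    """Map each nucleotide to a number; unknown symbols (e.g. N) become 0.0."""
    return [NUCLEOTIDE_VALUES.get(base, 0.0) for base in sequence.upper()]


def downsize_signal(signal: List[float], size: int) -> List[float]:
    """Reduce a signal to `size` values by averaging contiguous windows.

    Window averaging is an illustrative choice of downsizing step.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    n = len(signal)
    if n < size:
        raise ValueError("signal is shorter than the requested size")
    signature = []
    for i in range(size):
        # Integer arithmetic splits the signal into `size` non-empty windows.
        start = i * n // size
        end = (i + 1) * n // size
        window = signal[start:end]
        signature.append(sum(window) / len(window))
    return signature


def viral_signature(sequence: str, size: int = 256) -> List[float]:
    """Build a fixed-size signature (e.g. 64, 128, or 256 values) from a sequence."""
    return downsize_signal(sequence_to_signal(sequence), size)
```

Under these assumptions, a genome of any length at least `size` yields a vector of exactly `size` values, which is the kind of fixed-length input a deep learning classifier would expect. The signature sizes 64, 128, and 256 match those reported above; everything else in the sketch is illustrative.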