Issa Mohamed
Computer and Systems Department, Faculty of Engineering, Zagazig University, Zagazig, Egypt.
Faculty of Computers and Informatics, Nahda University, Beni Suef, Egypt.
Appl Soft Comput. 2021 Jun;104:107197. doi: 10.1016/j.asoc.2021.107197. Epub 2021 Feb 20.
COVID-19 is a global pandemic that has driven scientists to search for ways to prevent it and to design a drug against it. Low-cost, intelligent tools for biological data analysis are therefore important for studying the biological structure of COVID-19. One of the key bioinformatics tools is the global alignment algorithm, which gives the most accurate similarity measure between a pair of biological sequences. Its main limitation is its huge time consumption, especially for very long sequences. This work proposes a fast global alignment tool (G-Aligner) based on meta-heuristic algorithms, which estimates similarity measurements close to the exact ones in reasonable time and at low cost. With very long sequences, a G-Aligner based on the standard Sine-Cosine optimization algorithm (SCA) becomes trapped in local minima. An improved version of SCA, based on integration with PSO, is therefore presented in this work. In addition, mutation and opposition operators are applied to enhance the exploration capability and to avoid trapping in local minima. The performance of the improved SCA algorithm (SP-MO) was evaluated on a set of IEEE CEC functions. G-Aligner based on the SP-MO algorithm was also tested by measuring the similarity of real biological sequences, and its performance was validated by measuring the similarity of the COVID-19 virus with 13 other viruses. The tests concluded that the SP-MO algorithm outperforms the related studies in the literature and produces the highest average similarity measurements, reaching 75% of the exact value.
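To illustrate why the exact method is costly, the sketch below computes a global alignment score with the classic Needleman-Wunsch dynamic program, whose running time grows with the product of the two sequence lengths. It is a minimal illustration and not the paper's G-Aligner or SP-MO code. The function name `global_alignment_score` and the scoring values (`match=1`, `mismatch=-1`, `gap=-1`) are assumptions chosen for the example; the abstract does not state the scoring scheme used.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Exact global alignment score (Needleman-Wunsch, linear gap penalty).

    Scoring values are illustrative assumptions, not taken from the paper.
    Time is O(len(a) * len(b)); memory is O(len(b)) because only two rows
    of the dynamic-programming table are kept.
    """
    # Row 0: aligning the empty prefix of `a` with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of `b` costs i gaps.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Or place a gap in `b` (up) or a gap in `a` (left).
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

For two sequences of length n, the nested loops visit about n² cells. That quadratic cost is the time consumption that a meta-heuristic estimate trades against exactness.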