Issa Mohamed, Helmi Ahmed M, Elsheikh Ammar H, Abd Elaziz Mohamed
Computer and Systems Department, Faculty of Engineering, Zagazig University, Zagazig 44519, Egypt.
Engineering and Information Technology College, Buraydah Private Colleges, Buraydah 51418, Saudi Arabia.
Expert Syst Appl. 2022 Mar 1;189:116063. doi: 10.1016/j.eswa.2021.116063. Epub 2021 Oct 20.
The longest common consecutive subsequences (LCCS) play a vital role in revealing the biological relationships between DNA/RNA sequences, especially newly discovered ones such as COVID-19. FLAT (Fragmented Local Aligner Technique) is an accelerated version of the local pairwise sequence alignment algorithm based on meta-heuristic algorithms. Its performance needs to be enhanced because the huge length of biological sequences leads the search to become trapped in local optima. This paper introduces a modified version of FLAT that improves the performance of the BA algorithm by integrating it with the particle swarm optimization (PSO) algorithm through a novel infection mechanism. The proposed algorithm, named BPINF, uses BA operators to find the best-explored solution, which can then infect agents during the exploitation phase: infected agents use PSO operators to move toward the best-explored solution instead of the best-exploited solution. Moving the solutions toward these two best solutions increases the diversity of the generated solutions and avoids trapping in local optima. The infection can propagate through the population, since each infected agent can transfer the infection to non-infected agents, which further diversifies the generated solutions. FLAT using the proposed technique (BPINF) was validated by detecting the LCCS between a set of very long real biological sequences, in addition to COVID-19 and other well-known viruses. The performance of BPINF was compared with the enhanced versions of BA in the literature and with the relevant studies of FLAT. BPINF found the LCCS with the highest percentage (88%), outperforming the other state-of-the-art methods.
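To make the LCCS objective concrete, the following is a minimal sketch of the exact dynamic-programming computation of the longest common contiguous run between two sequences. It is not the FLAT or BPINF method described above; it is the quadratic-time exact baseline that becomes costly on very long sequences, which is the kind of cost heuristic approaches aim to avoid. The function name `longest_common_substring` and the return convention are illustrative assumptions, not taken from the paper.

```python
def longest_common_substring(a: str, b: str) -> tuple[int, int, int]:
    """Return (length, start_in_a, start_in_b) of the longest common
    contiguous run shared by a and b, or (0, 0, 0) if there is none.

    Illustrative exact baseline (O(len(a) * len(b)) time), not FLAT/BPINF.
    """
    best_len = 0
    best_end_a = 0
    best_end_b = 0
    # prev[j] = length of the common run ending at a[i-2] and b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended one position earlier in both strings.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end_a = i
                    best_end_b = j
        prev = curr
    return best_len, best_end_a - best_len, best_end_b - best_len


if __name__ == "__main__":
    length, start_a, start_b = longest_common_substring("ACGTTGCA", "TTACGTTG")
    print(length, start_a, start_b)
```

Only two rows of the table are kept in memory, so space is linear in the length of the second sequence, but the running time still grows with the product of the two lengths.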