Xu Jin, Li Qiwei, Li Victor O K, Li Shuo-Yen Robert, Fan Xiaodan
Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong.
Department of Statistics, The Chinese University of Hong Kong, Sha Tin, Hong Kong.
Int J Data Min Bioinform. 2013;8(4):462-79. doi: 10.1504/ijdmb.2013.056614.
This paper employs three Evolutionary Monte Carlo (EMC) schemes to solve the Short Adjacent Repeat Identification Problem (SARIP), which aims to identify the common repeat units shared by multiple sequences. The three EMC schemes, namely Random Exchange (RE), Best Exchange (BE), and crossover, are implemented on a parallel platform. The simulation results show that, compared with the conventional Markov Chain Monte Carlo (MCMC) algorithm, all three EMC schemes not only shorten the computation time by speeding up convergence but also improve the solution quality in difficult cases. Moreover, we observe that the performance of the different EMC schemes depends on the degree of degeneracy of the motif pattern.
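The abstract does not spell out how the exchange schemes operate, so the following is only a minimal sketch of a random exchange step as it is commonly formulated for tempered parallel chains, not the authors' implementation. It assumes each chain i samples a distribution proportional to exp(-E(x)/T_i); the names `exchange_accept_prob` and `random_exchange`, the energy values, and the temperature ladder are all hypothetical and chosen for illustration.

```python
import math
import random


def exchange_accept_prob(energy_i, energy_j, temp_i, temp_j):
    """Metropolis probability of swapping the states of two tempered chains.

    Assumes chain k targets a density proportional to exp(-E(x) / T_k),
    which is an illustrative assumption, not a detail from the abstract.
    """
    log_ratio = (energy_i - energy_j) * (1.0 / temp_i - 1.0 / temp_j)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


def random_exchange(states, energies, temps, rng):
    """Pick a random adjacent pair of chains and try to swap their states.

    Modifies `states` and `energies` in place; returns True if the swap
    was accepted. Temperatures stay attached to their chain positions.
    """
    i = rng.randrange(len(states) - 1)  # index of the lower chain in the pair
    j = i + 1
    p = exchange_accept_prob(energies[i], energies[j], temps[i], temps[j])
    if rng.random() < p:
        states[i], states[j] = states[j], states[i]
        energies[i], energies[j] = energies[j], energies[i]
        return True
    return False
```

In this sketch a swap that moves the lower-energy state to the colder chain is always accepted, while the reverse swap is accepted only with probability exp of a negative quantity, so good solutions tend to migrate toward the chain that samples the target distribution. A best exchange or crossover step would replace only the way the pair or the proposal is chosen; the abstract does not define those, so they are not sketched here.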