A multi-objective imperialist competitive algorithm (MOICA) for finding motifs in DNA sequences.

Authors

Gohardani Saeed Alirezanejad, Bagherian Mehri, Vaziri Hamidreza

Affiliations

Department of Applied Mathematics, Faculty of Mathematical Science, University of Guilan, Rasht, Iran.

Department of Biology, Faculty of Science, University of Guilan, Rasht, Iran.

Publication information

Math Biosci Eng. 2019 Feb 26;16(3):1575-1596. doi: 10.3934/mbe.2019075.

DOI:10.3934/mbe.2019075

PMID:30947433

Abstract

The motif discovery problem (MDP) is a well-known problem in biology that seeks transcription factor binding sites (TFBSs) in DNA sequences. On the one hand, biological knowledge about motif sites is limited; on the other hand, the problem is NP-hard. Consequently, no efficient procedure is known that finds motifs in every dataset. Some algorithms use exhaustive search, which is very time-consuming on large-scale datasets. Metaheuristic procedures, by contrast, appear to be a good choice for quickly finding a motif that has at least some acceptable biological properties. Most previous methods model the problem as a single-objective optimization problem; however, modeling it with multiple objectives improves the quality of the motifs obtained. Some multi-objective optimization models for MDP try to maximize three objectives simultaneously: motif length, support, and similarity. In this study, a multi-objective imperialist competitive algorithm (ICA) is adopted as an approximation algorithm for this problem. ICA explores the solution space more broadly and so avoids becoming trapped in local optima; it therefore promises good solutions in a reasonable time. Experimental results show that our method produces good solutions compared with well-known algorithms in the literature, according to both computational and biological indicators.
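
The abstract names three objectives to be maximized together (motif length, support, and similarity) but does not say how candidate motifs are compared. As a minimal sketch, assuming the standard Pareto-dominance comparison used in multi-objective optimization, the following shows how non-dominated candidates could be selected. The names `dominates` and `pareto_front` and the tuple layout are illustrative assumptions, not the authors' implementation, and the objective values themselves are taken as given.

```python
from typing import List, Tuple

# Hypothetical layout: (motif_length, support, similarity), all maximized.
Objectives = Tuple[float, float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is at least as good as b on every objective
    and strictly better on at least one (maximization)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(candidates: List[Objectives]) -> List[Objectives]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
```

A candidate that is longer but has lower support than another is not dominated by it, so both can remain on the front; a single-objective model would have to collapse that trade-off into one score.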

Similar articles

A multi-objective imperialist competitive algorithm (MOICA) for finding motifs in DNA sequences.

Math Biosci Eng. 2019 Feb 26;16(3):1575-1596. doi: 10.3934/mbe.2019075.

A cluster refinement algorithm for motif discovery.

IEEE/ACM Trans Comput Biol Bioinform. 2010 Oct-Dec;7(4):654-68. doi: 10.1109/TCBB.2009.25.

Voting algorithms for the motif finding problem.

Comput Syst Bioinformatics Conf. 2008;7:37-47.

SamSelect: a sample sequence selection algorithm for quorum planted motif search on large DNA datasets.

BMC Bioinformatics. 2018 Jun 18;19(1):228. doi: 10.1186/s12859-018-2242-y.

Efficient sequential and parallel algorithms for finding edit distance based motifs.

BMC Genomics. 2016 Aug 18;17 Suppl 4(Suppl 4):465. doi: 10.1186/s12864-016-2789-9.

HIGEDA: a hierarchical gene-set genetics based algorithm for finding subtle motifs in biological sequences.

Bioinformatics. 2010 Feb 1;26(3):302-9. doi: 10.1093/bioinformatics/btp676. Epub 2009 Dec 8.

A study on the application of topic models to motif finding algorithms.

BMC Bioinformatics. 2016 Dec 22;17(Suppl 19):502. doi: 10.1186/s12859-016-1364-3.

qPMS7: a fast algorithm for finding (ℓ, d)-motifs in DNA and protein sequences.

PLoS One. 2012;7(7):e41425. doi: 10.1371/journal.pone.0041425. Epub 2012 Jul 24.

An Algorithm for Motif Discovery with Iteration on Lengths of Motifs.

IEEE/ACM Trans Comput Biol Bioinform. 2015 Jan-Feb;12(1):136-41. doi: 10.1109/TCBB.2014.2351793.

Efficient motif finding algorithms for large-alphabet inputs.

BMC Bioinformatics. 2010 Oct 26;11 Suppl 8(Suppl 8):S1. doi: 10.1186/1471-2105-11-S8-S1.
