Finding composite regulatory patterns in DNA sequences.

Authors

Eskin Eleazar, Pevzner Pavel A

Affiliation

Department of Computer Science, Columbia University, New York, NY 10027, USA.

Publication information

Bioinformatics. 2002;18 Suppl 1:S354-63. doi: 10.1093/bioinformatics/18.suppl_1.s354.

DOI:10.1093/bioinformatics/18.suppl_1.s354

PMID:12169566

Abstract

Pattern discovery in unaligned DNA sequences is a fundamental problem in computational biology with important applications in finding regulatory signals. Current approaches to pattern discovery focus on monad patterns that correspond to relatively short contiguous strings. However, many of the actual regulatory signals are composite patterns that are groups of monad patterns that occur near each other. A difficulty in discovering composite patterns is that one or both of the component monad patterns in the group may be 'too weak'. Since the traditional monad-based motif finding algorithms usually output one (or a few) high scoring patterns, they often fail to find composite regulatory signals consisting of weak monad parts. In this paper, we present a MITRA (MIsmatch TRee Algorithm) approach for discovering composite signals. We demonstrate that MITRA performs well for both monad and composite patterns by presenting experiments over biological and synthetic data.
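The distinction the abstract draws between monad and composite patterns can be illustrated with a minimal brute-force sketch. This is not the MITRA mismatch tree itself, which the abstract does not describe in detail; it only shows what it means for a short monad pattern to occur with a bounded number of mismatches, and for two such monads to occur near each other as a composite (two-part) pattern. All function names, the gap parameters, and the example sequence are hypothetical and chosen for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def monad_hits(seq: str, pattern: str, max_mismatches: int) -> list:
    """Start positions where `pattern` occurs in `seq` with at most
    `max_mismatches` mismatches (a contiguous 'monad' occurrence)."""
    width = len(pattern)
    return [
        i
        for i in range(len(seq) - width + 1)
        if hamming(seq[i:i + width], pattern) <= max_mismatches
    ]


def composite_hits(seq: str, left: str, right: str,
                   max_mismatches: int, min_gap: int, max_gap: int) -> list:
    """Pairs (left_start, right_start) where both monads occur and the
    number of bases between them lies in [min_gap, max_gap].

    Assumption: a composite pattern is modelled as two monads with a
    bounded gap; the gap bounds are illustrative parameters.
    """
    right_starts = set(monad_hits(seq, right, max_mismatches))
    pairs = []
    for i in monad_hits(seq, left, max_mismatches):
        for gap in range(min_gap, max_gap + 1):
            j = i + len(left) + gap
            if j in right_starts:
                pairs.append((i, j))
    return pairs


# Hypothetical example: each monad is allowed one mismatch.
example = "TTACGTAAAAGGCTTT"
print(monad_hits(example, "ACGT", 1))
print(composite_hits(example, "ACGT", "GGCC", 1, 3, 5))
```

Scanning for a known pattern like this is the easy direction. The discovery problem described in the abstract is harder because the patterns are unknown, and each monad taken alone may be too weak to stand out from background matches.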

Similar articles

Efficient composite pattern finding from monad patterns.

Int J Bioinform Res Appl. 2007;3(1):86-99. doi: 10.1504/IJBRA.2007.011836.

PEAKS: identification of regulatory motifs by their position in DNA sequences.

Bioinformatics. 2007 Jan 15;23(2):243-4. doi: 10.1093/bioinformatics/btl568. Epub 2006 Nov 10.

Combining phylogenetic data with co-regulated genes to identify regulatory motifs.

Bioinformatics. 2003 Dec 12;19(18):2369-80. doi: 10.1093/bioinformatics/btg329.

Prediction of similarly acting cis-regulatory modules by subsequence profiling and comparative genomics in Drosophila melanogaster and D.pseudoobscura.

Bioinformatics. 2004 Nov 1;20(16):2738-50. doi: 10.1093/bioinformatics/bth320. Epub 2004 May 14.

Pattern locator: a new tool for finding local sequence patterns in genomic DNA sequences.

Bioinformatics. 2006 Dec 15;22(24):3099-100. doi: 10.1093/bioinformatics/btl551. Epub 2006 Nov 8.

Efficiently finding regulatory elements using correlation with gene expression.

J Bioinform Comput Biol. 2004 Jun;2(2):273-88. doi: 10.1142/s0219720004000612.

Efficient multiple genome alignment.

Bioinformatics. 2002;18 Suppl 1:S312-20. doi: 10.1093/bioinformatics/18.suppl_1.s312.

MISAE: a new approach for regulatory motif extraction.

Proc IEEE Comput Syst Bioinform Conf. 2004:173-81. doi: 10.1109/csb.2004.1332430.

A probabilistic method to detect regulatory modules.

Bioinformatics. 2003;19 Suppl 1:i292-301. doi: 10.1093/bioinformatics/btg1040.

Cited by

A Review on Planted (l, d) Motif Discovery Algorithms for Medical Diagnose.

Sensors (Basel). 2022 Feb 5;22(3):1204. doi: 10.3390/s22031204.

Review of Different Sequence Motif Finding Algorithms.

Avicenna J Med Biotechnol. 2019 Apr-Jun;11(2):130-148.

A study on the application of topic models to motif finding algorithms.

BMC Bioinformatics. 2016 Dec 22;17(Suppl 19):502. doi: 10.1186/s12859-016-1364-3.

A Comparative Analysis Between k-Mers and Community Detection-Based Features for the Task of Protein Classification.

IEEE Trans Nanobioscience. 2016 Mar;15(2):84-92. doi: 10.1109/TNB.2016.2523501. Epub 2016 Feb 3.

BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements.

Bioinformatics. 2015 Dec 1;31(23):3758-66. doi: 10.1093/bioinformatics/btv466. Epub 2015 Aug 8.

SeqGL Identifies Context-Dependent Binding Signals in Genome-Wide Regulatory Element Maps.

PLoS Comput Biol. 2015 May 27;11(5):e1004271. doi: 10.1371/journal.pcbi.1004271. eCollection 2015 May.

qPMS9: an efficient algorithm for quorum Planted Motif Search.

Sci Rep. 2015 Jan 15;5:7813. doi: 10.1038/srep07813.

PMS6MC: A Multicore Algorithm for Motif Discovery.

Algorithms. 2013 Nov 18;6(4):805-823. doi: 10.3390/a6040805.

Detecting epigenetic motifs in low coverage and metagenomics settings.

BMC Bioinformatics. 2014;15 Suppl 9(Suppl 9):S16. doi: 10.1186/1471-2105-15-S9-S16. Epub 2014 Sep 10.

MoTeX-II: structured MoTif eXtraction from large-scale datasets.

BMC Bioinformatics. 2014 Jul 8;15:235. doi: 10.1186/1471-2105-15-235.
