Assessment of composite motif discovery methods.

Authors

Klepper Kjetil, Sandve Geir K, Abul Osman, Johansen Jostein, Drablos Finn

Affiliations

Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway.

Publication information

BMC Bioinformatics. 2008 Feb 26;9:123. doi: 10.1186/1471-2105-9-123.

DOI:10.1186/1471-2105-9-123

PMID:18302777

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2311304/

Abstract

BACKGROUND

Computational discovery of regulatory elements is an important area of bioinformatics research, and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery, that is, discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so-called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery.

RESULTS

We have developed a benchmarking framework for composite motif discovery and used it to evaluate the performance of eight published module discovery tools. Benchmark datasets were constructed based on real genomic sequences containing experimentally verified regulatory modules, and the module discovery programs were asked both to predict the locations of these modules and to specify the single motifs involved. To aid the programs in their search, we provided position weight matrices corresponding to the binding motifs of the transcription factors involved. In addition, selections of decoy matrices were mixed with the genuine matrices on one dataset to test the response of programs to varying levels of noise.
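
The abstract does not say how predicted module locations were scored against the annotated ones. As a minimal sketch of one common way to do this, the example below counts overlap base by base and derives sensitivity and positive predictive value. The function name `nucleotide_level_stats`, the half-open `(start, end)` interval convention, and the choice of statistics are assumptions made for illustration, not the framework's actual implementation.

```python
def nucleotide_level_stats(true_intervals, predicted_intervals, seq_length):
    """Compare annotated and predicted module positions base by base.

    Intervals are half-open (start, end) pairs in sequence coordinates.
    This is an illustrative scoring scheme, not the one from the paper.
    """
    # Collect every position covered by an annotated module.
    true_pos = set()
    for start, end in true_intervals:
        true_pos.update(range(start, end))

    # Collect every position covered by a predicted module.
    pred_pos = set()
    for start, end in predicted_intervals:
        pred_pos.update(range(start, end))

    tp = len(true_pos & pred_pos)        # predicted and annotated
    fp = len(pred_pos - true_pos)        # predicted but not annotated
    fn = len(true_pos - pred_pos)        # annotated but missed
    tn = seq_length - tp - fp - fn       # neither predicted nor annotated

    # Guard against empty denominators (no annotation or no prediction).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0

    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "sensitivity": sensitivity, "ppv": ppv}


if __name__ == "__main__":
    # Hypothetical example: one annotated module and one shifted prediction.
    stats = nucleotide_level_stats([(100, 200)], [(150, 250)], 1000)
    print(stats)
```

Counting at the nucleotide level rewards partial overlap in proportion to its size, which is one reason such statistics are often preferred over all-or-nothing matching of whole modules.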

CONCLUSION

Although some of the methods tested tended to score somewhat better than others overall, there were still large variations between individual datasets and no single method performed consistently better than the rest in all situations. The variation in performance on individual datasets also shows that the new benchmark datasets represent a suitable variety of challenges to most methods for module discovery.
