SeAMotE: a method for high-throughput motif discovery in nucleic acid sequences.

Author information

Federico Agostini, Davide Cirillo, Riccardo Delli Ponti, Gian Gaetano Tartaglia

Affiliations

Gene Function and Evolution, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader 88, 08003 Barcelona, Spain.

Publication information

BMC Genomics. 2014 Oct 23;15(1):925. doi: 10.1186/1471-2164-15-925.

DOI:10.1186/1471-2164-15-925

PMID:25341390

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4223730/

Abstract

BACKGROUND

The large amount of data produced by high-throughput sequencing poses new computational challenges. In the last decade, several tools have been developed for the identification of transcription and splicing factor binding sites.

RESULTS

Here, we introduce the SeAMotE (Sequence Analysis of Motifs Enrichment) algorithm for the discovery of regulatory regions in nucleic acid sequences. SeAMotE provides (i) a robust analysis of high-throughput sequence sets, (ii) a motif search based on pattern occurrences and (iii) an easy-to-use web-server interface. We applied our method to recently published data, including 351 chromatin immunoprecipitation (ChIP) and 13 crosslinking immunoprecipitation (CLIP) experiments, and compared our results with those of other well-established motif discovery tools. SeAMotE shows an average accuracy of 80% in finding discriminative motifs and outperforms other methods available in the literature.
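To make the idea of a discriminative motif search based on pattern occurrences more concrete, the following is a minimal Python sketch. It is not the SeAMotE implementation, which the abstract does not describe in detail. It assumes fixed-length patterns (k-mers), scores each pattern by the difference between the fraction of positive and negative sequences that contain it, and uses made-up toy sequences. The function names `kmer_coverage` and `discriminative_kmers` are hypothetical.

```python
from collections import Counter


def kmer_coverage(sequences, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set ensures each k-mer is counted once per sequence.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    return counts


def discriminative_kmers(positives, negatives, k=5, top=3):
    """Rank k-mers by coverage in the positive set minus coverage in the negative set.

    This simple difference score is an illustrative assumption,
    not the statistic used by SeAMotE.
    """
    pos = kmer_coverage(positives, k)
    neg = kmer_coverage(negatives, k)
    scores = {}
    for kmer in set(pos) | set(neg):
        pos_frac = pos[kmer] / len(positives)
        neg_frac = neg[kmer] / len(negatives)
        scores[kmer] = pos_frac - neg_frac
    # Highest score first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Toy data (invented for illustration): every positive sequence contains TGCAT.
positives = ["ACGTGCATT", "TTGCATGGA", "CCTGCATAC"]
negatives = ["ACGTACGTA", "GGGCCCAAA", "TTTTAAAAC"]

ranked = discriminative_kmers(positives, negatives, k=5, top=1)
print(ranked)
```

On real ChIP or CLIP data one would typically also allow degenerate patterns and assess enrichment with a statistical test rather than a raw coverage difference; those choices are left out here to keep the sketch short.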

CONCLUSIONS

SeAMotE is a fast, accurate and flexible algorithm for the identification of sequence patterns involved in protein-DNA and protein-RNA recognition. The server can be freely accessed at http://s.tartaglialab.com/new_submission/seamote.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4e6/4223730/15a163469af4/12864_2014_6626_Fig1_HTML.jpg

Similar articles

TrawlerWeb: an online de novo motif discovery tool for next-generation sequencing datasets.

BMC Genomics. 2018 Apr 5;19(1):238. doi: 10.1186/s12864-018-4630-0.

MEME-ChIP: motif analysis of large DNA datasets.

Bioinformatics. 2011 Jun 15;27(12):1696-7. doi: 10.1093/bioinformatics/btr189. Epub 2011 Apr 12.

An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments.

Nat Biotechnol. 2002 Aug;20(8):835-9. doi: 10.1038/nbt717. Epub 2002 Jul 8.

STREME: accurate and versatile sequence motif discovery.

Bioinformatics. 2021 Sep 29;37(18):2834-2840. doi: 10.1093/bioinformatics/btab203.

A Monte Carlo-based framework enhances the discovery and interpretation of regulatory sequence motifs.

BMC Bioinformatics. 2012 Nov 27;13:317. doi: 10.1186/1471-2105-13-317.

catRAPID omics: a web server for large-scale prediction of protein-RNA interactions.

Bioinformatics. 2013 Nov 15;29(22):2928-30. doi: 10.1093/bioinformatics/btt495. Epub 2013 Aug 23.

Discovering motifs in ranked lists of DNA sequences.

PLoS Comput Biol. 2007 Mar 23;3(3):e39. doi: 10.1371/journal.pcbi.0030039.

An Efficient Algorithm for Discovering Motifs in Large DNA Data Sets.

IEEE Trans Nanobioscience. 2015 Jul;14(5):535-44. doi: 10.1109/TNB.2015.2421340. Epub 2015 Apr 9.

Discriminative motif discovery in DNA and protein sequences using the DEME algorithm.

BMC Bioinformatics. 2007 Oct 15;8:385. doi: 10.1186/1471-2105-8-385.

Cited by

Design and characterization of G-quadruplex RNA aptamers reveal RNA-binding by KDM5 lysine demethylases.

Comput Struct Biotechnol J. 2025 Jun 16;27:2719-2729. doi: 10.1016/j.csbj.2025.06.027. eCollection 2025.

In-silico identification and comparison of transcription factor binding sites cluster in anterior-posterior patterning genes in Drosophila melanogaster and Tribolium castaneum.

PLoS One. 2023 Aug 17;18(8):e0290035. doi: 10.1371/journal.pone.0290035. eCollection 2023.

Zooming in on protein-RNA interactions: a multi-level workflow to identify interaction partners.

Biochem Soc Trans. 2020 Aug 28;48(4):1529-1543. doi: 10.1042/BST20191059.

Mechanisms and consequences of subcellular RNA localization across diverse cell types.

Traffic. 2020 Jun;21(6):404-418. doi: 10.1111/tra.12730. Epub 2020 Apr 29.

Direct AUC optimization of regulatory motifs.

Bioinformatics. 2017 Jul 15;33(14):i243-i251. doi: 10.1093/bioinformatics/btx255.

WSMD: weakly-supervised motif discovery in transcription factor ChIP-seq data.

Sci Rep. 2017 Jun 12;7(1):3217. doi: 10.1038/s41598-017-03554-7.

Combining phylogenetic footprinting with motif models incorporating intra-motif dependencies.

BMC Bioinformatics. 2017 Mar 1;18(1):141. doi: 10.1186/s12859-017-1495-1.

Advances in the characterization of RNA-binding proteins.

Wiley Interdiscip Rev RNA. 2016 Nov;7(6):793-810. doi: 10.1002/wrna.1378. Epub 2016 Aug 8.

The orientation of transcription factor binding site motifs in gene promoter regions: does it matter?

BMC Genomics. 2016 Mar 3;17:185. doi: 10.1186/s12864-016-2549-x.

DynaMIT: the dynamic motif integration toolkit.

Nucleic Acids Res. 2016 Jan 8;44(1):e2. doi: 10.1093/nar/gkv807. Epub 2015 Aug 7.
