De-novo discovery of differentially abundant transcription factor binding sites including their positional preference.

Affiliation

Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany.

Publication information

PLoS Comput Biol. 2011 Feb 10;7(2):e1001070. doi: 10.1371/journal.pcbi.1001070.

DOI:10.1371/journal.pcbi.1001070

PMID:21347314

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3037384/

Abstract

Transcription factors are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in promoters. The de-novo discovery of transcription factor binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not been fully solved yet. Here, we present a de-novo motif discovery tool called Dispom for finding differentially abundant transcription factor binding sites that models existing positional preferences of binding sites and adjusts the length of the motif in the learning process. Evaluating Dispom, we find that its prediction performance is superior to existing tools for de-novo motif discovery for 18 benchmark data sets with planted binding sites, and for a metazoan compendium based on experimental data from micro-array, ChIP-chip, ChIP-DSL, and DamID as well as Gene Ontology data. Finally, we apply Dispom to find binding sites differentially abundant in promoters of auxin-responsive genes extracted from Arabidopsis thaliana microarray data, and we find a motif that can be interpreted as a refined auxin responsive element predominately positioned in the 250-bp region upstream of the transcription start site. Using an independent data set of auxin-responsive genes, we find in genome-wide predictions that the refined motif is more specific for auxin-responsive genes than the canonical auxin-responsive element. In general, Dispom can be used to find differentially abundant motifs in sequences of any origin. However, the positional distribution learned by Dispom is especially beneficial if all sequences are aligned to some anchor point like the transcription start site in case of promoter sequences. We demonstrate that the combination of searching for differentially abundant motifs and inferring a position distribution from the data is beneficial for de-novo motif discovery. Hence, we make the tool freely available as a component of the open-source Java framework Jstacs and as a stand-alone application at http://www.jstacs.de/index.php/Dispom.
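
To make the idea in the abstract concrete, the following is a minimal Python sketch of scoring a motif by differential abundance while weighting candidate sites with a positional preference. It is not Dispom's algorithm and does not use the Jstacs API. The toy motif, the uniform background, the Gaussian position weights, the example sequences, and all function names are assumptions chosen for illustration only; in particular, nothing is learned here, whereas the abstract describes learning the motif, its length, and the position distribution from data.

```python
import math

# Hypothetical toy motif (TGTC-like): base probabilities per motif position.
MOTIF = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
]
BACKGROUND = 0.25  # assumed uniform base frequency


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def site_log_odds(site):
    """Log-odds of one candidate site under the motif versus the background."""
    return sum(math.log(col[base] / BACKGROUND) for col, base in zip(MOTIF, site))


def log_position_weights(n_starts, mean, sd):
    """Log of discretised Gaussian weights over start positions (sum to 1)."""
    raw = [-0.5 * ((p - mean) / sd) ** 2 for p in range(n_starts)]
    norm = log_sum_exp(raw)
    return [r - norm for r in raw]


def sequence_score(seq, mean, sd):
    """Log of the position-weighted sum of site likelihood ratios in one sequence."""
    width = len(MOTIF)
    n_starts = len(seq) - width + 1
    log_w = log_position_weights(n_starts, mean, sd)
    terms = [log_w[p] + site_log_odds(seq[p:p + width]) for p in range(n_starts)]
    return log_sum_exp(terms)


def differential_score(foreground, control, mean, sd):
    """Mean sequence score in the foreground minus the mean in the control set."""
    fg = sum(sequence_score(s, mean, sd) for s in foreground) / len(foreground)
    bg = sum(sequence_score(s, mean, sd) for s in control) / len(control)
    return fg - bg


# Made-up sequences, all aligned to the same anchor (the right end stands in
# for a transcription start site). The foreground carries TGTC at start 6.
FOREGROUND = ["ACGATATGTCAA", "CCATAGTGTCGA"]
CONTROL = ["ACGATAACGAAA", "CCATAGCATGGA"]

if __name__ == "__main__":
    print(differential_score(FOREGROUND, CONTROL, mean=6, sd=1.5))
```

In this toy setting, a position distribution centred on the true site location rewards the planted sites more than one centred elsewhere, which mirrors the abstract's point that positional information helps most when sequences share an anchor point.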

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ac93/3037384/c0d77cbe265a/pcbi.1001070.g001.jpg

Similar articles

Dispom: a discriminative de-novo motif discovery tool based on the jstacs library.

J Bioinform Comput Biol. 2013 Feb;11(1):1340006. doi: 10.1142/S0219720013400064. Epub 2013 Jan 21.

Positional distribution of transcription factor binding sites in Arabidopsis thaliana.

Sci Rep. 2016 Apr 27;6:25164. doi: 10.1038/srep25164.

Sequence-based prediction of transcription upregulation by auxin in plants.

J Bioinform Comput Biol. 2015 Feb;13(1):1540009. doi: 10.1142/S0219720015400090.

Prediction of auxin response elements based on data fusion in Arabidopsis thaliana.

Mol Biol Rep. 2018 Oct;45(5):763-772. doi: 10.1007/s11033-018-4216-6. Epub 2018 Jun 23.

AthaMap-assisted transcription factor target gene identification in Arabidopsis thaliana.

Database (Oxford). 2010 Dec 21;2010:baq034. doi: 10.1093/database/baq034. Print 2010.

Athena: a resource for rapid visualization and systematic analysis of Arabidopsis promoter sequences.

Bioinformatics. 2005 Dec 15;21(24):4411-3. doi: 10.1093/bioinformatics/bti714. Epub 2005 Oct 13.

Bioinformatic cis-element analyses performed in Arabidopsis and rice disclose bZIP- and MYB-related binding sites as potential AuxRE-coupling elements in auxin-mediated transcription.

BMC Plant Biol. 2012 Aug 1;12:125. doi: 10.1186/1471-2229-12-125.

Hypermethylation of Auxin-Responsive Motifs in the Promoters of the Transcription Factor Genes Accompanies the Somatic Embryogenesis Induction in Arabidopsis.

Int J Mol Sci. 2020 Sep 18;21(18):6849. doi: 10.3390/ijms21186849.

Architecture of DNA elements mediating ARF transcription factor binding and auxin-responsive gene expression in Arabidopsis.

Proc Natl Acad Sci U S A. 2020 Sep 29;117(39):24557-24566. doi: 10.1073/pnas.2009554117. Epub 2020 Sep 14.

Cited by

Identification of a 301 bp promoter core region of the SrUGT91D2 gene from Stevia rebaudiana that contributes to hormone and abiotic stress inducibility.

BMC Plant Biol. 2024 Oct 3;24(1):921. doi: 10.1186/s12870-024-05616-1.

Genomic background sequences systematically outperform synthetic ones in de novo motif discovery for ChIP-seq data.

NAR Genom Bioinform. 2024 Jul 27;6(3):lqae090. doi: 10.1093/nargab/lqae090. eCollection 2024 Sep.

Methyl Jasmonate Activates the Gene and Stimulates Tanshinone Accumulation in Solid Callus Cultures.

Molecules. 2022 Mar 8;27(6):1772. doi: 10.3390/molecules27061772.

Identification of Two Auxin-Regulated Potassium Transporters Involved in Seed Maturation.

Int J Mol Sci. 2018 Jul 22;19(7):2132. doi: 10.3390/ijms19072132.

Responses to auxin signals: an operating principle for dynamical sensitivity yet high resilience.

R Soc Open Sci. 2018 Jan 24;5(1):172098. doi: 10.1098/rsos.172098. eCollection 2018 Jan.

Combining phylogenetic footprinting with motif models incorporating intra-motif dependencies.

BMC Bioinformatics. 2017 Mar 1;18(1):141. doi: 10.1186/s12859-017-1495-1.

The orientation of transcription factor binding site motifs in gene promoter regions: does it matter?

BMC Genomics. 2016 Mar 3;17:185. doi: 10.1186/s12864-016-2549-x.

Inferring intra-motif dependencies of DNA binding sites from ChIP-seq data.

BMC Bioinformatics. 2015 Nov 9;16:375. doi: 10.1186/s12859-015-0797-4.

Varying levels of complexity in transcription factor binding motifs.

Nucleic Acids Res. 2015 Oct 15;43(18):e119. doi: 10.1093/nar/gkv577. Epub 2015 Jun 26.

Computational analysis of auxin responsive elements in the Arabidopsis thaliana L. genome.

BMC Genomics. 2014;15 Suppl 12(Suppl 12):S4. doi: 10.1186/1471-2164-15-S12-S4. Epub 2014 Dec 19.

References

Comprehensive transcriptome analysis of auxin responses in Arabidopsis.

Mol Plant. 2008 Mar;1(2):321-37. doi: 10.1093/mp/ssm021. Epub 2008 Jan 29.

Finding sequence motifs with Bayesian models incorporating positional information: an application to transcription factor binding sites.

BMC Bioinformatics. 2008 Jun 4;9:262. doi: 10.1186/1471-2105-9-262.

Transcription factor and microRNA motif discovery: the Amadeus platform and a compendium of metazoan target sets.

Genome Res. 2008 Jul;18(7):1180-9. doi: 10.1101/gr.076117.108. Epub 2008 Apr 14.

Accurate splice site prediction using support vector machines.

BMC Bioinformatics. 2007;8 Suppl 10(Suppl 10):S7. doi: 10.1186/1471-2105-8-S10-S7.

JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update.

Nucleic Acids Res. 2008 Jan;36(Database issue):D102-6. doi: 10.1093/nar/gkm955. Epub 2007 Nov 15.

A universal framework for regulatory element discovery across all genomes and data types.

Mol Cell. 2007 Oct 26;28(2):337-50. doi: 10.1016/j.molcel.2007.09.027.

Discriminative motif discovery in DNA and protein sequences using the DEME algorithm.

BMC Bioinformatics. 2007 Oct 15;8:385. doi: 10.1186/1471-2105-8-385.

Auxin response factors.

Curr Opin Plant Biol. 2007 Oct;10(5):453-60. doi: 10.1016/j.pbi.2007.08.014. Epub 2007 Sep 27.

Electrophoretic mobility shift assay (EMSA) for detecting protein-nucleic acid interactions.

Nat Protoc. 2007;2(8):1849-61. doi: 10.1038/nprot.2007.249.

Improved benchmarks for computational motif discovery.

BMC Bioinformatics. 2007 Jun 8;8:193. doi: 10.1186/1471-2105-8-193.
