Varying levels of complexity in transcription factor binding motifs.

Affiliations

Institute for Biosafety in Plant Biotechnology, Julius Kühn-Institut (JKI) - Federal Research Centre for Cultivated Plants, D-06484 Quedlinburg, Germany.

Institute of Computer Science, Martin Luther University Halle-Wittenberg, D-06099 Halle (Saale), Germany.

Publication information

Nucleic Acids Res. 2015 Oct 15;43(18):e119. doi: 10.1093/nar/gkv577. Epub 2015 Jun 26.

DOI:10.1093/nar/gkv577

PMID:26116565

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4605289/

Abstract

Binding of transcription factors to DNA is one of the keystones of gene regulation. The existence of statistical dependencies between binding site positions is widely accepted, while their relevance for computational predictions has been debated. Building probabilistic models of binding sites that may capture dependencies is still challenging, since the most successful motif discovery approaches require numerical optimization techniques, which are not suited for selecting dependency structures. To overcome this issue, we propose sparse local inhomogeneous mixture (Slim) models that combine putative dependency structures in a weighted manner allowing for numerical optimization of dependency structure and model parameters simultaneously. We find that Slim models yield a substantially better prediction performance than previous models on genomic context protein binding microarray data sets and on ChIP-seq data sets. To elucidate the reasons for the improved performance, we develop dependency logos, which allow for visual inspection of dependency structures within binding sites. We find that the dependency structures discovered by Slim models are highly diverse and highly transcription factor-specific, which emphasizes the need for flexible dependency models. The observed dependency structures range from broad heterogeneities to sparse dependencies between neighboring and non-neighboring binding site positions.
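The idea of combining putative dependency structures in a weighted manner can be illustrated with a small sketch. It is not the authors' implementation. It assumes a simplified form in which each motif position mixes one parent-free component with components conditioned on a single earlier position at distance 1 up to `max_dist`. All names (`uniform_model`, `slim_log_likelihood`, `weights`, `indep`, `cond`) are hypothetical, and parameter learning is omitted entirely.

```python
import math

ALPHABET = "ACGT"


def uniform_model(length, max_dist):
    """Build toy parameters (an assumption for illustration, not learned values).

    weights[l][k]    : mixture weight at position l for component k
                       (k = 0: no parent; k >= 1: parent is position l - k)
    indep[l][a]      : P(x_l = a) for the parent-free component
    cond[l][k][b][a] : P(x_l = a | x_{l-k} = b) for k >= 1 (cond[l][0] is unused)
    """
    weights, indep, cond = [], [], []
    for l in range(length):
        n_parents = min(l, max_dist)  # only positions to the left can be parents
        weights.append([1.0 / (n_parents + 1)] * (n_parents + 1))
        indep.append({a: 0.25 for a in ALPHABET})
        tables = [None]  # placeholder for the parent-free component
        for _ in range(n_parents):
            tables.append({b: {a: 0.25 for a in ALPHABET} for b in ALPHABET})
        cond.append(tables)
    return weights, indep, cond


def slim_log_likelihood(seq, weights, indep, cond):
    """Log-likelihood of seq under the position-wise weighted mixture."""
    total = 0.0
    for l, a in enumerate(seq):
        # Parent-free component.
        p = weights[l][0] * indep[l][a]
        # Components conditioned on the nucleotide k positions to the left.
        for k in range(1, len(weights[l])):
            p += weights[l][k] * cond[l][k][seq[l - k]][a]
        total += math.log(p)
    return total


if __name__ == "__main__":
    w, ind, cnd = uniform_model(length=4, max_dist=2)
    print(slim_log_likelihood("ACGT", w, ind, cnd))
```

Because the mixture weights are continuous parameters, a weight close to 1 for one component and close to 0 for the others effectively selects a single dependency structure for that position. This is the sense in which, under the assumptions of the sketch, structure and parameters could be tuned together by numerical optimization.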

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1792/4605289/9bed2853c903/gkv577fig1.jpg

Similar articles

Varying levels of complexity in transcription factor binding motifs.

Nucleic Acids Res. 2015 Oct 15;43(18):e119. doi: 10.1093/nar/gkv577. Epub 2015 Jun 26.

Automated incorporation of pairwise dependency in transcription factor binding site prediction using dinucleotide weight tensors.

PLoS Comput Biol. 2017 Jul 28;13(7):e1005176. doi: 10.1371/journal.pcbi.1005176. eCollection 2017 Jul.

Inferring intra-motif dependencies of DNA binding sites from ChIP-seq data.

BMC Bioinformatics. 2015 Nov 9;16:375. doi: 10.1186/s12859-015-0797-4.

A graph-based motif detection algorithm models complex nucleotide dependencies in transcription factor binding sites.

Nucleic Acids Res. 2006;34(20):5730-9. doi: 10.1093/nar/gkl585. Epub 2006 Oct 13.

MotifLab: a tools and data integration workbench for motif discovery and regulatory sequence analysis.

BMC Bioinformatics. 2013 Jan 16;14:9. doi: 10.1186/1471-2105-14-9.

An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs.

BMC Bioinformatics. 2010 Nov 8;11:551. doi: 10.1186/1471-2105-11-551.

New scoring schema for finding motifs in DNA Sequences.

BMC Bioinformatics. 2009 Mar 20;10:93. doi: 10.1186/1471-2105-10-93.

GSMC: Combining Parallel Gibbs Sampling with Maximal Cliques for Hunting DNA Motif.

J Comput Biol. 2017 Dec;24(12):1243-1253. doi: 10.1089/cmb.2017.0100. Epub 2017 Nov 8.

Combining phylogenetic footprinting with motif models incorporating intra-motif dependencies.

BMC Bioinformatics. 2017 Mar 1;18(1):141. doi: 10.1186/s12859-017-1495-1.

Probabilistic models for semisupervised discriminative motif discovery in DNA sequences.

IEEE/ACM Trans Comput Biol Bioinform. 2011 Sep-Oct;8(5):1309-17. doi: 10.1109/TCBB.2010.84.

Cited by

Genomic background sequences systematically outperform synthetic ones in de novo motif discovery for ChIP-seq data.

NAR Genom Bioinform. 2024 Jul 27;6(3):lqae090. doi: 10.1093/nargab/lqae090. eCollection 2024 Sep.

A statistical approach for identifying single nucleotide variants that affect transcription factor binding.

iScience. 2024 Apr 18;27(5):109765. doi: 10.1016/j.isci.2024.109765. eCollection 2024 May 17.

Peak Scores Significantly Depend on the Relationships between Contextual Signals in ChIP-Seq Peaks.

Int J Mol Sci. 2024 Jan 13;25(2):1011. doi: 10.3390/ijms25021011.

Widespread effects of DNA methylation and intra-motif dependencies revealed by novel transcription factor binding models.

Nucleic Acids Res. 2023 Oct 13;51(18):e95. doi: 10.1093/nar/gkad693.

CVD-associated SNPs with regulatory potential reveal novel non-coding disease genes.

Hum Genomics. 2023 Jul 25;17(1):69. doi: 10.1186/s40246-023-00513-4.

Double DAP-seq uncovered synergistic DNA binding of interacting bZIP transcription factors.

Nat Commun. 2023 May 5;14(1):2600. doi: 10.1038/s41467-023-38096-2.

A survey on algorithms to characterize transcription factor binding sites.

Brief Bioinform. 2023 May 19;24(3). doi: 10.1093/bib/bbad156.

Motif models proposing independent and interdependent impacts of nucleotides are related to high and low affinity transcription factor binding sites in Arabidopsis.

Front Plant Sci. 2022 Jul 28;13:938545. doi: 10.3389/fpls.2022.938545. eCollection 2022.

Systematic Evaluation of DNA Sequence Variations on Transcription Factor Binding Affinity.

Front Genet. 2021 Sep 9;12:667866. doi: 10.3389/fgene.2021.667866. eCollection 2021.

Learning the Regulatory Code of Gene Expression.

Front Mol Biosci. 2021 Jun 10;8:673363. doi: 10.3389/fmolb.2021.673363. eCollection 2021.
