Evaluation of methods for modeling transcription factor sequence specificity.

Affiliation

Banting and Best Department of Medical Research and Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.

Publication information

Nat Biotechnol. 2013 Feb;31(2):126-34. doi: 10.1038/nbt.2486. Epub 2013 Jan 27.

Abstract

Genomic analyses often involve scanning for potential transcription factor (TF) binding sites using models of the sequence specificity of DNA binding proteins. Many approaches have been developed to model and learn a protein's DNA-binding specificity, but these methods have not been systematically compared. Here we applied 26 such approaches to in vitro protein binding microarray data for 66 mouse TFs belonging to various families. For nine TFs, we also scored the resulting motif models on in vivo data, and found that the best in vitro-derived motifs performed similarly to motifs derived from the in vivo data. Our results indicate that simple models based on mononucleotide position weight matrices trained by the best methods perform similarly to more complex models for most TFs examined, but fall short in specific cases (<10% of the TFs examined here). In addition, the best-performing motifs typically have relatively low information content, consistent with widespread degeneracy in eukaryotic TF sequence preferences.
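As a minimal sketch of the two ideas named in the abstract, the following Python shows how a mononucleotide position weight matrix can be used to scan a sequence, and how a motif's information content can be computed. The motif values, the uniform background of 0.25, and the function names are all illustrative assumptions. They are not taken from the paper and do not reproduce any of the 26 evaluated methods.

```python
import math

BASES = "ACGT"

# Hypothetical 4-position motif: per-position base probabilities.
# These numbers are made up for illustration only.
PFM = [
    {"A": 0.70, "C": 0.10, "G": 0.10, "T": 0.10},
    {"A": 0.10, "C": 0.10, "G": 0.70, "T": 0.10},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # fully degenerate position
    {"A": 0.10, "C": 0.70, "G": 0.10, "T": 0.10},
]
BACKGROUND = 0.25  # assumed uniform background frequency


def log_odds(pfm, background=BACKGROUND):
    """Convert base probabilities into log2 odds against the background."""
    return [{b: math.log2(col[b] / background) for b in BASES} for col in pfm]


def information_content(pfm):
    """Total information content in bits, assuming a uniform background."""
    total = 0.0
    for col in pfm:
        total += 2.0 + sum(p * math.log2(p) for p in col.values() if p > 0)
    return total


def scan(sequence, pwm):
    """Score every forward-strand window; returns a list of (offset, score)."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]


def best_hit(sequence, pwm):
    """Return the (offset, score) pair with the highest score."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


PWM = log_odds(PFM)
offset, score = best_hit("TTAGTCTT", PWM)
print(f"best window starts at {offset}, score {score:.3f} bits")
print(f"motif information content: {information_content(PFM):.3f} bits")
```

The degenerate third position contributes zero bits, which is one way a motif can end up with relatively low information content. A real analysis would also scan the reverse strand and estimate the background from the genome rather than assuming it is uniform.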
