Evaluation of methods for modeling transcription factor sequence specificity.

Affiliation

Banting and Best Department of Medical Research and Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.

Publication information

Nat Biotechnol. 2013 Feb;31(2):126-34. doi: 10.1038/nbt.2486. Epub 2013 Jan 27.

Abstract

Genomic analyses often involve scanning for potential transcription factor (TF) binding sites using models of the sequence specificity of DNA binding proteins. Many approaches have been developed to model and learn a protein's DNA-binding specificity, but these methods have not been systematically compared. Here we applied 26 such approaches to in vitro protein binding microarray data for 66 mouse TFs belonging to various families. For nine TFs, we also scored the resulting motif models on in vivo data, and found that the best in vitro-derived motifs performed similarly to motifs derived from the in vivo data. Our results indicate that simple models based on mononucleotide position weight matrices trained by the best methods perform similarly to more complex models for most TFs examined, but fall short in specific cases (<10% of the TFs examined here). In addition, the best-performing motifs typically have relatively low information content, consistent with widespread degeneracy in eukaryotic TF sequence preferences.
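As a rough illustration of the terms used in the abstract, the sketch below scores a DNA sequence with a mononucleotide position weight matrix and computes a motif's information content in bits. It is not the evaluation pipeline or any of the 26 methods from the paper. The toy matrix, the uniform background of 0.25, the pseudocount, and all function names are assumptions chosen for illustration.

```python
import math

BASES = "ACGT"  # column order assumed for every matrix row below


def information_content(ppm, background=0.25):
    """Total information content (bits) of a position probability matrix.

    Each row of `ppm` holds the probabilities of A, C, G, T at one position.
    A uniform background is an assumption, not something the source states.
    """
    total = 0.0
    for column in ppm:
        for p in column:
            if p > 0:
                total += p * math.log2(p / background)
    return total


def log_odds(ppm, background=0.25, pseudocount=0.01):
    """Convert probabilities to log-odds weights, smoothing to avoid log(0)."""
    pwm = []
    for column in ppm:
        smoothed = [(p + pseudocount) / (1 + 4 * pseudocount) for p in column]
        pwm.append([math.log2(p / background) for p in smoothed])
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence (forward strand only)."""
    width = len(pwm)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        scores.append(sum(pwm[i][BASES.index(b)] for i, b in enumerate(window)))
    return scores


def best_hit(sequence, pwm):
    """Return (start, score) of the highest-scoring window."""
    scores = scan(sequence, pwm)
    start = max(range(len(scores)), key=scores.__getitem__)
    return start, scores[start]


# Hypothetical 4-bp motif with consensus TGAC; not taken from the paper.
TOY_PPM = [
    [0.1, 0.1, 0.1, 0.7],
    [0.1, 0.1, 0.7, 0.1],
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
]

if __name__ == "__main__":
    pwm = log_odds(TOY_PPM)
    print(best_hit("AAATGACAA", pwm))
    print(information_content(TOY_PPM))
```

A column with equal probabilities for all four bases contributes 0 bits and a column that admits a single base contributes 2 bits, so a low total relative to the motif's length corresponds to the degenerate sequence preferences the abstract describes.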
