
Probabilistic inference of transcription factor binding from multiple data sources.

Authors

Lähdesmäki Harri, Rust Alistair G, Shmulevich Ilya

Affiliation

Institute for Systems Biology, Seattle, Washington, United States of America.

Publication

PLoS One. 2008 Mar 26;3(3):e1820. doi: 10.1371/journal.pone.0001820.

DOI:10.1371/journal.pone.0001820
PMID:18364997
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2268002/
Abstract

An important problem in molecular biology is to build a complete understanding of transcriptional regulatory processes in the cell. We have developed a flexible, probabilistic framework to predict TF binding from multiple data sources that differs from the standard hypothesis testing (scanning) methods in several ways. Our probabilistic modeling framework estimates the probability of binding and, thus, naturally reflects our degree of belief in binding. Probabilistic modeling also allows for easy and systematic integration of our binding predictions into other probabilistic modeling methods, such as expression-based gene network inference. The method answers the question of whether the whole analyzed promoter has a binding site, but can also be extended to estimate the binding probability at each nucleotide position. Further, we introduce an extension to model combinatorial regulation by several TFs. Most importantly, the proposed methods can make principled probabilistic inference from multiple evidence sources, such as multiple statistical models (motifs) of the TFs, evolutionary conservation, regulatory potential, CpG islands, nucleosome positioning, DNase hypersensitive sites, ChIP-chip binding segments and other (prior) sequence-based biological knowledge. We developed both a likelihood and a Bayesian method, where the latter is implemented with a Markov chain Monte Carlo algorithm. Results on a carefully constructed test set from the mouse genome demonstrate that principled data fusion can significantly improve the performance of TF binding prediction methods. We also applied the probabilistic modeling framework to all promoters in the mouse genome, and the results indicate a sparse connectivity between transcriptional regulators and their target promoters. To facilitate analysis of other sequences and additional data, we have developed an on-line web tool, ProbTF, which implements our probabilistic TF binding prediction method using multiple data sources.
The test data set, a web tool, source code and supplementary data are available at: http://www.probtf.org.
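
The idea of reporting a binding probability for a whole promoter, rather than a per-window test statistic, can be illustrated with a small sketch. The code below is not the ProbTF implementation and does not reproduce the paper's model; it assumes a simplified "zero or one site per promoter" mixture, a zeroth-order background, and a single motif. The names `likelihood_ratios`, `binding_posterior`, `site_prior`, and `position_weights` are hypothetical. The position weights stand in for additional evidence such as conservation, entering as a position-specific prior.

```python
from math import prod


def likelihood_ratios(seq, pwm, background):
    """Motif-vs-background likelihood ratio for every window start in seq."""
    width = len(pwm)
    ratios = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        ratios.append(
            prod(pwm[j][base] / background[base] for j, base in enumerate(window))
        )
    return ratios


def binding_posterior(seq, pwm, background, site_prior, position_weights=None):
    """Posterior probability that seq contains a site (zero-or-one-site model).

    site_prior       -- assumed prior probability that the promoter is bound
    position_weights -- assumed non-negative weights, one per window start,
                        encoding extra evidence (for example, conservation)
    Returns (promoter_posterior, per_position_posteriors).
    """
    ratios = likelihood_ratios(seq, pwm, background)
    if not ratios:
        return 0.0, []
    if position_weights is None:
        position_weights = [1.0] * len(ratios)
    total = sum(position_weights)
    # Prior probability of a site starting at each position.
    prior = [site_prior * w / total for w in position_weights]
    # Joint terms are expressed relative to the background-only likelihood,
    # which cancels between numerator and denominator.
    joint = [p * r for p, r in zip(prior, ratios)]
    evidence = (1.0 - site_prior) + sum(joint)
    per_position = [j / evidence for j in joint]
    return sum(per_position), per_position


# Toy example: a width-2 motif that strongly prefers "AC".
background = {b: 0.25 for b in "ACGT"}
pwm = [
    {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01},
    {"A": 0.01, "C": 0.97, "G": 0.01, "T": 0.01},
]
p_flat, per_pos = binding_posterior("GGACGG", pwm, background, site_prior=0.5)
p_weighted, _ = binding_posterior(
    "GGACGG", pwm, background, site_prior=0.5,
    position_weights=[0.1, 0.1, 10.0, 0.1, 0.1],
)
p_none, _ = binding_posterior("GGGGGG", pwm, background, site_prior=0.5)
```

In this toy setting the promoter containing "AC" receives a posterior well above the prior, the promoter without it receives a posterior near zero, and up-weighting the true site position raises the posterior further, which is the qualitative effect that combining evidence sources is meant to have.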


Figures: 13 article figures (pone.0001820.g001 to pone.0001820.g013) are available in the PMC full text.

