Mathematical Modeling of Avidity Distribution and Estimating General Binding Properties of Transcription Factors from Genome-Wide Binding Profiles.

Author information

Kuznetsov Vladimir A

Affiliations

Bioinformatics Institute, Agency of Science, Technology and Research, 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Singapore.

School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798, Singapore.

Publication information

Methods Mol Biol. 2017;1613:193-276. doi: 10.1007/978-1-4939-7027-8_9.

DOI:10.1007/978-1-4939-7027-8_9
PMID:28849563
Abstract

The shape of the experimental frequency distributions (EFD) of diverse molecular interaction events quantifying genome-wide binding is often skewed to the rare but abundant quantities. Such distributions are systematically deviated from standard power-law functions proposed by scale-free network models suggesting that more explanatory and predictive probabilistic model(s) are needed. Identification of the mechanism-based data-driven statistical distributions that provide an estimation and prediction of binding properties of transcription factors from genome-wide binding profiles is the goal of this analytical survey. Here, we review and develop an analytical framework for modeling, analysis, and prediction of transcription factor (TF) DNA binding properties detected at the genome scale. We introduce a mixture probabilistic model of binding avidity function that includes nonspecific and specific binding events. A method for decomposition of specific and nonspecific TF-DNA binding events is proposed. We show that the Kolmogorov-Waring (KW) probability function (PF), modeling the steady state TF binding-dissociation stochastic process, fits well with the EFD for diverse TF-DNA binding datasets. Furthermore, this distribution predicts the total number of TF-DNA binding sites (BSs), estimating specificity and sensitivity as well as other basic statistical features of DNA-TF binding when the experimental datasets are noise-rich and essentially incomplete. The KW distribution fits equally well to TF-DNA binding activity for different TFs including ERE, CREB, STAT1, Nanog, and Oct4. Our analysis reveals that the KW distribution and its generalized form provide a family of power-law-like distributions given in terms of hypergeometric series functions, including standard and generalized Pareto and Waring distributions, providing flexible and common skewed forms of the transcription factor binding site (TFBS) avidity distribution function. We suggest that the skewed binding events may be due to a wide range of evolutionary processes of creating weak avidity TFBS associated with random mutations, while the rare high-avidity binding sites (i.e., high-avidity evolutionarily conserved canonical e-boxes) rarely occurred. These, however, may be positively selected in microevolution.
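
The abstract names the Waring distribution as one member of the power-law-like family but does not give the formula of the KW function itself. As a rough illustration only, the sketch below evaluates the textbook one-sided Waring distribution, P(X = k) = ρ (a)_k / (a + ρ)_{k+1} for k = 0, 1, 2, ..., whose tail decays approximately as k^-(ρ+1). This is an assumed stand-in, not the chapter's KW function; the function names `waring_pmf` and `waring_sf` and the parameter values are hypothetical.

```python
import math


def waring_pmf(k: int, a: float, rho: float) -> float:
    """P(X = k) for the standard Waring distribution, k = 0, 1, 2, ...

    P(X = k) = rho * Gamma(a + rho) / Gamma(a)
                   * Gamma(a + k) / Gamma(a + rho + k + 1)
    Assumed textbook parameterization (a > 0, rho > 0), not the KW function.
    """
    if a <= 0 or rho <= 0:
        raise ValueError("a and rho must be positive")
    if k < 0:
        return 0.0
    # Work in log space so that large k does not overflow the gamma function.
    log_p = (math.log(rho)
             + math.lgamma(a + rho) - math.lgamma(a)
             + math.lgamma(a + k) - math.lgamma(a + rho + k + 1))
    return math.exp(log_p)


def waring_sf(n: int, a: float, rho: float) -> float:
    """Upper tail P(X >= n) = (a)_n / (a + rho)_n, from Waring's expansion."""
    if a <= 0 or rho <= 0:
        raise ValueError("a and rho must be positive")
    if n <= 0:
        return 1.0
    log_s = (math.lgamma(a + n) + math.lgamma(a + rho)
             - math.lgamma(a) - math.lgamma(a + rho + n))
    return math.exp(log_s)


if __name__ == "__main__":
    a, rho = 1.0, 2.0  # illustrative values, not fitted to any dataset
    for k in (0, 1, 2, 10, 100):
        print(k, waring_pmf(k, a, rho))
    # The head of the distribution plus the closed-form tail should total 1.
    head = sum(waring_pmf(k, a, rho) for k in range(1000))
    print("head + tail:", head + waring_sf(1000, a, rho))
```

Successive terms satisfy P(k + 1) / P(k) = (a + k) / (a + ρ + k + 1), a ratio that approaches 1 as k grows, which is what produces the heavy, skewed tail. Fitting such a form to real binding counts would also need a treatment of unobserved zero-count sites, which this sketch does not attempt.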

Similar articles

1
Mathematical Modeling of Avidity Distribution and Estimating General Binding Properties of Transcription Factors from Genome-Wide Binding Profiles.
Methods Mol Biol. 2017;1613:193-276. doi: 10.1007/978-1-4939-7027-8_9.
2
Relative avidity, specificity, and sensitivity of transcription factor-DNA binding in genome-scale experiments.
Methods Mol Biol. 2009;563:15-50. doi: 10.1007/978-1-60761-175-2_2.
3
Statistics of protein-DNA binding and the total number of binding sites for a transcription factor in the mammalian genome.
BMC Genomics. 2010 Feb 10;11 Suppl 1(Suppl 1):S12. doi: 10.1186/1471-2164-11-S1-S12.
4
Computational analysis and modeling of genome-scale avidity distribution of transcription factor binding sites in ChIP-PET experiments.
Genome Inform. 2007;19:83-94.
5
Assessing the model transferability for prediction of transcription factor binding sites based on chromatin accessibility.
BMC Bioinformatics. 2017 Jul 27;18(1):355. doi: 10.1186/s12859-017-1769-7.
6
Molecular and structural considerations of TF-DNA binding for the generation of biologically meaningful and accurate phylogenetic footprinting analysis: the LysR-type transcriptional regulator family as a study model.
BMC Genomics. 2016 Aug 27;17(1):686. doi: 10.1186/s12864-016-3025-3.
7
Nonconsensus Protein Binding to Repetitive DNA Sequence Elements Significantly Affects Eukaryotic Genomes.
PLoS Comput Biol. 2015 Aug 18;11(8):e1004429. doi: 10.1371/journal.pcbi.1004429. eCollection 2015 Aug.
8
A biophysical model for analysis of transcription factor interaction and binding site arrangement from genome-wide binding data.
PLoS One. 2009 Dec 1;4(12):e8155. doi: 10.1371/journal.pone.0008155.
9
Quantitative modeling of transcription factor binding specificities using DNA shape.
Proc Natl Acad Sci U S A. 2015 Apr 14;112(15):4654-9. doi: 10.1073/pnas.1422023112. Epub 2015 Mar 9.
10
Non-targeted transcription factors motifs are a systemic component of ChIP-seq datasets.
Genome Biol. 2014 Jul 29;15(7):412. doi: 10.1186/s13059-014-0412-4.

Cited by

1
Perspectives on Codebook: sequence specificity of uncharacterized human transcription factors.
bioRxiv. 2024 Nov 12:2024.11.11.622097. doi: 10.1101/2024.11.11.622097.
2
GHT-SELEX demonstrates unexpectedly high intrinsic sequence specificity and complex DNA binding of many human transcription factors.
bioRxiv. 2024 Nov 12:2024.11.11.618478. doi: 10.1101/2024.11.11.618478.
3
Effect of promoter, promoter mutation and enhancer on transgene expression mediated by episomal vectors in transfected HEK293, Chang liver and primary cells.
Bioengineered. 2019 Dec;10(1):548-560. doi: 10.1080/21655979.2019.1684863.
4
Toward predictive R-loop computational biology: genome-scale prediction of R-loops reveals their association with complex promoter structures, G-quadruplexes and transcriptionally active enhancers.
Nucleic Acids Res. 2018 Sep 6;46(15):7566-7585. doi: 10.1093/nar/gky554.