Genome-wide inference of transcription factor-DNA binding specificity in cell regeneration using a combination strategy.

Affiliation

Institute of Hepatobiliary Surgery, Southwest Hospital, Third Military Medical University, Chongqing 400010, China.

Publication information

Chem Biol Drug Des. 2012 Nov;80(5):734-44. doi: 10.1111/cbdd.12013. Epub 2012 Sep 10.

Abstract

Cell growth, development, and the regeneration of tissues and organs involve a large number of gene regulation events, which are mediated in part by transcription factors (TFs) binding to cis-regulatory elements in the genome. Predicting the binding affinity and inferring the binding specificity of TF-DNA interactions at the genomic level would substantially improve our understanding of the molecular mechanism and biological implications of sequence-specific TF-DNA recognition. In this study, we report the development of a combination method to characterize the interaction of an 11-mer oligonucleotide segment and its mutants with the Gcn4p protein, a homodimeric basic leucine zipper TF, and to predict the binding affinity and specificity of potential Gcn4p binders on a genome-wide scale. In this procedure, a position-mutated energy matrix is created from molecular modeling analysis of native and mutated Gcn4p-DNA complex structures. The matrix describes the position-independent interaction energy profile of Gcn4p with each nucleotide type at each position of the oligonucleotide. The energy terms extracted from the matrix, together with their interaction terms, are then correlated with the experimentally measured affinities of 19,268 distinct oligonucleotides using statistical modeling. Subsequently, the best of the resulting regression models is applied to screen the complete genome for potential high-affinity Gcn4p binders.
The findings of this study are briefly listed below: (i) the 11 positions of the oligonucleotide interact strongly and contribute non-additively to Gcn4p-DNA binding affinity; (ii) indirect conformational effects of nucleotide mutations, together with the associated subtle changes in interfacial atomic contacts, rather than direct nonbonded interactions, are primarily responsible for sequence-specific recognition; (iii) the intrinsic synergistic effects among the sequence positions of the oligonucleotide determine Gcn4p-DNA binding affinity and specificity; (iv) linear regression models combined with variable selection seem to perform fairly well in capturing the internal dependences hidden in the Gcn4p-DNA system, although ignoring nonlinear factors may lead the models to systematically underestimate high-affinity samples and overestimate low-affinity samples.
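
To make the modeling step more concrete, the following is a minimal sketch of how per-position energy terms and their pairwise interaction terms could be regressed against affinities. It is not the authors' implementation. The energy matrix and the affinities here are random placeholders, the pairwise product encoding is an assumed form of the "interaction terms", and the function names (`encode`, `fit_linear`, `predict`) are hypothetical. Variable selection, which the abstract mentions, is omitted for brevity.

```python
import itertools
import numpy as np

BASES = "ACGT"
L = 11  # oligonucleotide length stated in the abstract


def encode(seq, energy):
    """Return per-position energy terms followed by all pairwise products.

    `energy` is an (L x 4) matrix: one energy per position and base.
    The pairwise products are an assumed stand-in for interaction terms.
    """
    single = np.array([energy[i, BASES.index(b)] for i, b in enumerate(seq)])
    pairs = np.array([single[i] * single[j]
                      for i, j in itertools.combinations(range(len(seq)), 2)])
    return np.concatenate([single, pairs])


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(X, coef):
    """Apply the fitted linear model to a feature matrix."""
    return coef[0] + X @ coef[1:]


# Synthetic demonstration data (placeholders, NOT values from the study).
rng = np.random.default_rng(0)
energy = rng.normal(size=(L, 4))
seqs = ["".join(rng.choice(list(BASES), size=L)) for _ in range(500)]
X = np.array([encode(s, energy) for s in seqs])

# Fake affinities: additive part plus one cross term plus small noise.
y = X[:, :L].sum(axis=1) + 0.5 * X[:, L] + rng.normal(scale=0.01, size=len(seqs))

coef = fit_linear(X, y)
y_hat = predict(X, coef)
```

With 11 positions this encoding yields 11 single terms and 55 pairwise terms per sequence. In a real setting the matrix would come from the molecular modeling step and the affinities from experiment, and a selection procedure would prune the 66 candidate variables before the model is used for genome-wide screening.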
