Department of Computer Science & Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong.
Nucleic Acids Res. 2012 Oct;40(19):9392-403. doi: 10.1093/nar/gks749. Epub 2012 Aug 16.
In protein-DNA interactions, particularly the binding of transcription factors (TFs) to transcription factor binding sites (TFBSs), associated residue variations form patterns denoted as subtypes. Subtypes may lead to changed binding preferences, distinguish conserved from flexible binding residues and reveal novel binding mechanisms. However, subtypes must be studied in the context of core bindings. While solving 3D structures would require huge experimental effort, recent sequence-based discovery of associated TF-TFBS patterns has shown promise, making a large-scale subtype study both possible and desirable. In this article, we investigate residue-varying subtypes based on associated TF-TFBS patterns. By re-categorizing the patterns with respect to varying TF amino acids, statistically significant (P values ≤ 0.005) subtypes leading to varying TFBS patterns are discovered without using TF family or domain annotations. The resultant subtypes carry various biological meanings: they reflect familial and functional properties and exhibit changed binding preferences supported by 3D structures. Conserved residues critical for maintaining TF-TFBS bindings are revealed by analyzing the subtypes. In-depth analysis of the subtype pair PKVVIL-CACGTG versus PKVEIL-CAGCTG shows that the V/E variation is indicative of the distinction between the Myc and MRF families. Discovered from sequences only, the TF-TFBS subtypes are informative and promising for further biological findings, complementing and extending recent one-sided subtype and familial studies with comprehensive evidence.
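As a rough illustration of what a residue-varying subtype pair looks like, the sketch below pairs associated TF-TFBS core patterns whose TF cores differ at exactly one residue and whose TFBS cores also differ. It is a minimal sketch under stated assumptions, not the procedure used in the article: the function names (`hamming_positions`, `find_subtype_pairs`) and the tuple input format are hypothetical, the single-residue restriction is a simplification, and no statistical significance test is performed. The only data used is the pattern pair quoted in the abstract.

```python
from itertools import combinations


def hamming_positions(a, b):
    """Return the indices at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def find_subtype_pairs(associated_patterns):
    """Find candidate subtype pairs among (tf_core, tfbs_core) tuples.

    Assumption for illustration: a candidate pair has TF cores that
    differ at exactly one residue and TFBS cores that differ at one
    or more positions. Significance testing is deliberately omitted.
    """
    pairs = []
    for (tf_a, bs_a), (tf_b, bs_b) in combinations(associated_patterns, 2):
        # Only compare patterns of matching widths.
        if len(tf_a) != len(tf_b) or len(bs_a) != len(bs_b):
            continue
        tf_diff = hamming_positions(tf_a, tf_b)
        bs_diff = hamming_positions(bs_a, bs_b)
        if len(tf_diff) == 1 and bs_diff:
            pos = tf_diff[0]
            pairs.append({
                "tf_position": pos,  # 0-based index of the varying residue
                "tf_variation": f"{tf_a[pos]}/{tf_b[pos]}",
                "tfbs_positions": bs_diff,  # 0-based varying TFBS positions
                "patterns": ((tf_a, bs_a), (tf_b, bs_b)),
            })
    return pairs


if __name__ == "__main__":
    # The subtype pair quoted in the abstract.
    example = [("PKVVIL", "CACGTG"), ("PKVEIL", "CAGCTG")]
    for pair in find_subtype_pairs(example):
        print(pair["tf_variation"], pair["tf_position"], pair["tfbs_positions"])
```

For the pair from the abstract, the sketch reports the V/E variation at the fourth residue of the TF core together with the two central TFBS positions where CACGTG and CAGCTG differ.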