Rinaldi Fabio C, Doyle Lindsey A, Stoddard Barry L, Bogdanove Adam J
Plant Pathology and Plant-Microbe Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA.
Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98019, USA.
Nucleic Acids Res. 2017 Jun 20;45(11):6960-6970. doi: 10.1093/nar/gkx342.
Transcription activator-like effectors (TALEs) recognize their DNA targets via tandem repeats, each specifying a single nucleotide base in a one-to-one sequential arrangement. Due to this modularity and their ability to bind long DNA sequences with high specificity, TALEs have been used in many applications. Contributions of individual repeat-nucleotide associations to affinity and specificity have been characterized. Here, using in vitro binding assays, we examined the relationship between the number of repeats in a TALE and its affinity, for both target and non-target DNA. Each additional repeat provides extra binding energy for the target DNA, with the gain decaying exponentially such that binding energy saturates. Affinity for non-target DNA also increases non-linearly with the number of repeats, but with a slower decay of gain. The difference between the effect of length on affinity for target versus non-target DNA manifests in specificity increasing then diminishing with increasing TALE length, peaking between 15 and 19 repeats. Modeling across different hypothetical saturation levels and rates of gain decay, reflecting different repeat compositions, yielded a similar range of specificity optima. This range encompasses the mean and median length of native TALEs, suggesting that these proteins as a group have evolved for maximum specificity.
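The qualitative behavior described in the abstract can be illustrated with a minimal sketch. It assumes one simple functional form consistent with "gain decaying exponentially such that binding energy saturates": cumulative energy E(n) = E_max * (1 - exp(-k * n)). Specificity is taken here as the difference between target and non-target binding energy. The function names and all parameter values (saturation levels and decay rates) are hypothetical illustrative choices, not values fitted or reported in the study.

```python
import math


def cumulative_energy(n_repeats, e_max, decay_rate):
    """Total binding energy (arbitrary units) for a TALE with n_repeats repeats.

    Assumed form: each added repeat contributes a gain that decays
    exponentially, so the total saturates at e_max.
    """
    return e_max * (1.0 - math.exp(-decay_rate * n_repeats))


def specificity(n_repeats, target_params, non_target_params):
    """Target minus non-target binding energy (assumed specificity proxy)."""
    return (cumulative_energy(n_repeats, *target_params)
            - cumulative_energy(n_repeats, *non_target_params))


# Hypothetical (e_max, decay_rate) pairs, chosen only for illustration.
# Target: higher saturation level, faster decay of per-repeat gain.
# Non-target: lower saturation level, slower decay of per-repeat gain.
TARGET = (10.0, 0.15)
NON_TARGET = (5.5, 0.05)

lengths = range(1, 41)
spec_by_length = {n: specificity(n, TARGET, NON_TARGET) for n in lengths}

# Repeat count at which the specificity proxy is largest.
best_n = max(spec_by_length, key=spec_by_length.get)

# Closed-form optimum for this assumed model, from d(specificity)/dn = 0.
(e_t, k_t), (e_n, k_n) = TARGET, NON_TARGET
analytic_optimum = math.log((e_t * k_t) / (e_n * k_n)) / (k_t - k_n)

print(f"specificity peaks at {best_n} repeats "
      f"(continuous optimum ~ {analytic_optimum:.2f})")
```

Because the non-target curve keeps gaining after the target curve has nearly saturated, the difference rises, peaks, and then declines toward the gap between the two saturation levels. Changing the hypothetical parameters shifts where the peak falls, which mirrors the kind of exploration across saturation levels and decay rates that the abstract describes.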