Guo Haiwei H, Choe Juno, Loeb Lawrence A
Joseph Gottstein Memorial Cancer Laboratory, Department of Pathology, University of Washington School of Medicine, Seattle, 98195-7705, USA.
Proc Natl Acad Sci U S A. 2004 Jun 22;101(25):9205-10. doi: 10.1073/pnas.0403255101. Epub 2004 Jun 14.
Mutagenesis of protein-encoding sequences occurs ubiquitously; it enables evolution, accumulates during aging, and is associated with disease. Many biotechnological methods exploit random mutations to evolve novel proteins. To quantitate protein tolerance to random change, it is vital to understand the probability that a random amino acid replacement will lead to a protein's functional inactivation. We define this probability as the "x factor." Here, we develop a broadly applicable approach to calculate x factors and demonstrate this method using the human DNA repair enzyme 3-methyladenine DNA glycosylase (AAG). Three gene-wide mutagenesis libraries were created, each with a diversity of 10^5 and averaging 2.2, 4.6, and 6.2 random amino acid changes per mutant. After determining the percentage of functional mutants in each library using high-stringency selection (>19,000-fold), the x factor was found to be 34% ± 6%. Remarkably, reanalysis of data from studies of diverse proteins reveals similar inactivation probabilities. To delineate the nature of tolerated amino acid substitutions, we sequenced 244 surviving AAG mutants. The 920 tolerated substitutions were characterized by substitutability index and mapped onto the AAG primary, secondary, and known tertiary structures. Evolutionarily conserved residues show low substitutability indices. In AAG, beta strands are on average less substitutable than alpha helices, and surface loops that are not involved in DNA binding are the most substitutable. Our results are relevant to such diverse topics as applied molecular evolution, the rate of introduction of deleterious alleles into genomes in evolutionary history, and organisms' tolerance of mutational burden.
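The relationship between the x factor and the fraction of functional mutants in a library can be illustrated with a minimal sketch. It rests on two simplifying assumptions that the abstract does not state: each amino acid change inactivates the protein independently with probability x, and the number of changes per mutant is Poisson distributed around the library average. Under those assumptions the expected functional fraction is exp(-x * mean), which can be inverted to estimate x. The function names are hypothetical, the fractions are model predictions rather than measured data, and this is not presented as the authors' actual fitting procedure.

```python
import math


def expected_functional_fraction(x, mean_changes):
    """Expected fraction of functional mutants in a library.

    Assumes (illustratively) that each change inactivates independently
    with probability x, and that the number of changes per mutant is
    Poisson distributed with mean `mean_changes`. Then
    E[(1 - x)^N] = exp(-x * mean_changes).
    """
    return math.exp(-x * mean_changes)


def estimate_x(functional_fraction, mean_changes):
    """Invert the model above to recover x from one library's functional fraction."""
    return -math.log(functional_fraction) / mean_changes


# Values taken from the abstract: the reported x factor and the three library averages.
X_REPORTED = 0.34
MEAN_CHANGES = [2.2, 4.6, 6.2]

# Model predictions only; these are NOT the measured percentages from the study.
predicted = {m: expected_functional_fraction(X_REPORTED, m) for m in MEAN_CHANGES}

# Round trip: each predicted fraction should give back the x it was generated from.
recovered = [estimate_x(fraction, m) for m, fraction in predicted.items()]

for m, fraction in predicted.items():
    print(f"mean changes = {m}: predicted functional fraction = {fraction:.3f}")
```

In practice, one estimate of x per library could be averaged or fitted jointly, and the spread across libraries gives a rough sense of uncertainty comparable in spirit to the ± 6% reported above.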