Cokol M, Nair R, Rost B
CUBIC, Columbia University, Department of Biochemistry and Molecular Biophysics, New York, NY 10032, USA.
EMBO Rep. 2000 Nov;1(5):411-5. doi: 10.1093/embo-reports/kvd092.
A variety of nuclear localization signals (NLSs) are known experimentally, although only one motif was available for database searches through PROSITE. We initially collected a set of 91 experimentally verified NLSs from the literature. Through iterated 'in silico mutagenesis', we then extended the set to 214 potential NLSs. This final set matched 43% of all known nuclear proteins and no known non-nuclear protein. We estimated that >17% of all eukaryotic proteins may be imported into the nucleus. Finally, we found an overlap between the NLS and the DNA-binding region for 90% of the proteins for which both regions were known. Thus, evolution seemed to have used part of the existing DNA-binding mechanism when compartmentalizing DNA-binding proteins into the nucleus. However, only 56 of our 214 NLS motifs overlapped with DNA-binding regions. These 56 NLSs enabled a de novo prediction of partial DNA-binding regions for approximately 800 proteins in human, fly, worm and yeast.
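The abstract does not spell out how the iterated 'in silico mutagenesis' was carried out, so the following is only a minimal sketch of one plausible reading: relax a known motif one position at a time, and keep a variant only if it matches no non-nuclear sequence while matching more nuclear sequences than before. The acceptance rule, the function names (`count_matches`, `variants`, `extend_motif`) and the toy sequences are all assumptions made for illustration. The starting motif PKKKRKV is the well-known SV40 large T antigen NLS; the sequences around it are invented.

```python
import re

def count_matches(tokens, sequences):
    """Count sequences that contain the motif (tokens joined into one regex)."""
    rx = re.compile("".join(tokens))
    return sum(1 for seq in sequences if rx.search(seq))

def variants(tokens):
    """Yield motifs that relax exactly one position of `tokens`."""
    for i, tok in enumerate(tokens):
        if tok in ("K", "R"):
            # Allow either basic residue at this position.
            yield tokens[:i] + ["[KR]"] + tokens[i + 1:]
        if tok != ".":
            # Allow any residue at this position.
            yield tokens[:i] + ["."] + tokens[i + 1:]

def extend_motif(tokens, nuclear, non_nuclear):
    """Greedy relaxation (assumed rule): accept a variant only if it hits
    no non-nuclear sequence and strictly more nuclear sequences."""
    best = list(tokens)
    best_hits = count_matches(best, nuclear)
    improved = True
    while improved:
        improved = False
        for cand in variants(best):
            if count_matches(cand, non_nuclear) > 0:
                continue  # any false positive disqualifies the variant
            hits = count_matches(cand, nuclear)
            if hits > best_hits:
                best, best_hits, improved = cand, hits, True
                break  # restart from the newly accepted motif
    return best, best_hits

# Toy data, invented for illustration only.
nuclear = ["MAPKKKRKVEDP", "MSPKRKRKVTT", "MTTPKKKRKLGG"]
non_nuclear = ["MAAPLLKRKLGG", "MGSAKKKRKVAA"]

best, hits = extend_motif(list("PKKKRKV"), nuclear, non_nuclear)
print("".join(best), hits)
```

Because each accepted step must strictly increase the number of nuclear matches, the loop stops after at most as many accepted steps as there are nuclear sequences. The zero-false-positive condition mirrors the abstract's statement that the final set matched no known non-nuclear protein.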