Badger J, Sauder J M, Adams J M, Antonysamy S, Bain K, Bergseid M G, Buchanan S G, Buchanan M D, Batiyenko Y, Christopher J A, Emtage S, Eroshkina A, Feil I, Furlong E B, Gajiwala K S, Gao X, He D, Hendle J, Huber A, Hoda K, Kearins P, Kissinger C, Laubert B, Lewis H A, Lin J, Loomis K, Lorimer D, Louie G, Maletic M, Marsh C D, Miller I, Molinari J, Muller-Dieckmann H J, Newman J M, Noland B W, Pagarigan B, Park F, Peat T S, Post K W, Radojicic S, Ramos A, Romero R, Rutter M E, Sanderson W E, Schwinn K D, Tresser J, Winhoven J, Wright T A, Wu L, Xu J, Harris T J R
Structural GenomiX Inc., San Diego, California, USA.
Proteins. 2005 Sep 1;60(4):787-96. doi: 10.1002/prot.20541.
The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank (PDB) of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures, with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the PDB, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the PDB in the 3.5 years since the SGX target list was compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB.