Schwarzenbacher Robert, Godzik Adam, Jaroszewski Lukasz
University of Salzburg, Structural Biology, Billrothstrasse 11, 5020 Salzburg, Austria.
Acta Crystallogr D Biol Crystallogr. 2008 Jan;64(Pt 1):133-40. doi: 10.1107/S0907444907050111. Epub 2007 Dec 5.
The success rate of molecular replacement (MR) falls considerably when search models share less than 35% sequence identity with their templates, but it can be improved significantly by combining fold-recognition methods with exhaustive MR searches. Models based on alignments calculated with fold-recognition algorithms are more accurate than models based on conventional alignment methods such as FASTA or BLAST, which are still widely used for MR. In addition, MR pipelines that integrate phasing and automated refinement and allow such calculations to be processed in parallel can effectively increase the success rate of MR. Here, updated results are presented from the Joint Center for Structural Genomics (JCSG) MR pipeline, which to date has solved 33 MR structures with less than 35% sequence identity to the closest homologue of known structure. Using difficult MR problems as examples, it is demonstrated that successful MR phasing is possible even in cases where the similarity between the model and the template can only be detected with fold-recognition algorithms. In the first step, several search models are built based on all homologues found in the Protein Data Bank (PDB) by fold-recognition algorithms. The resulting models are used in parallel MR searches with different combinations of input parameters of the MR phasing algorithm. The putative solutions are subjected to rigid-body and restrained crystallographic refinement and ranked by their final free R factor, figure of merit and deviations from ideal geometry. Finally, crystal packing and electron-density maps are checked to identify the correct solution. If this procedure does not yield a solution with interpretable electron-density maps, additional alternative models are prepared: the structurally variable regions of a protein family are identified from alignments of sequences and known structures from that family, and appropriate trimmings of the models are proposed. All combinations of these trimmings are applied to the search models and the resulting set of models is used in the MR pipeline. It is estimated that, with the improvements in model building and exhaustive parallel searches with existing phasing algorithms, MR can be successful for more than 50% of recognizable homologues of known structures below the threshold of 35% sequence identity. This implies that about one-third of the proteins in a typical bacterial proteome are potential MR targets.
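The bookkeeping behind the exhaustive search described above can be illustrated with a minimal Python sketch. It is not the JCSG implementation. Every name in it (`Solution`, `trimming_combinations`, `build_jobs`, `rank_solutions`, the template and trimming labels, the `resolution` parameter) is hypothetical. "All combinations of these trimmings" is read here as every subset of the proposed trimmings, and the ranking is assumed to sort by free R factor first, then figure of merit, then geometry deviation; the abstract names these three criteria but does not say how they are weighted.

```python
from dataclasses import dataclass
from itertools import combinations, product


@dataclass(frozen=True)
class Solution:
    """One refined putative MR solution (hypothetical record layout)."""
    model: str
    r_free: float      # free R factor after refinement, lower is better
    fom: float         # figure of merit, higher is better
    rmsd_bonds: float  # deviation from ideal geometry, lower is better


def trimming_combinations(trimmings):
    """Yield every subset of the proposed trimmings, including the empty one."""
    for size in range(len(trimmings) + 1):
        for subset in combinations(trimmings, size):
            yield subset


def build_jobs(templates, trimmings, parameter_sets):
    """Cross every template with every trimming subset and parameter set.

    Each tuple describes one independent MR search, so the list can be
    handed to any parallel job runner.
    """
    return list(product(templates, trimming_combinations(trimmings), parameter_sets))


def rank_solutions(solutions):
    """Sort by free R (ascending), then FOM (descending), then geometry.

    This lexicographic ordering is an assumption made for illustration.
    """
    return sorted(solutions, key=lambda s: (s.r_free, -s.fom, s.rmsd_bonds))


if __name__ == "__main__":
    jobs = build_jobs(
        ["template_A", "template_B"],
        ["loop_12_20", "n_terminus", "c_terminus"],
        [{"resolution": 4.0}, {"resolution": 6.0}],
    )
    # 2 templates x 2**3 trimming subsets x 2 parameter sets
    print(len(jobs))
```

The number of searches grows as 2^n in the number of proposed trimmings, which is why the searches need to run in parallel.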