Miller S T, Hogle J M, Filman D J
Committee for Higher Degrees in Biophysics, Harvard University, Cambridge, MA 02138, USA.
J Mol Biol. 2001 Mar 23;307(2):499-512. doi: 10.1006/jmbi.2001.4485.
A genetic algorithm-based computational method for the ab initio phasing of diffraction data from crystals of symmetric macromolecular structures, such as icosahedral viruses, has been implemented and applied to authentic data from the P1/Mahoney strain of poliovirus. Using only single-wavelength native diffraction data, the method is shown to be able to generate correct phases, and thus electron density, to 3.0 Å resolution. Beginning with no advance knowledge of the shape of the virus and only approximate knowledge of its size, the method uses a genetic algorithm to determine coarse, low-resolution (here, 20.5 Å) models of the virus that obey the known non-crystallographic symmetry (NCS) constraints. The best-scoring of these models are subjected to refinement and NCS-averaging, with subsequent phase extension to high resolution (3.0 Å). Initial difficulties in phase extension were overcome by measuring and including all low-resolution terms in the transform. With the low-resolution data included, the method was successful in generating essentially correct phases and electron density to 6.0 Å in every one of ten trials from different models identified by the genetic algorithm. Retrospective analysis revealed that these correct high-resolution solutions converged from a range of significantly different low-resolution phase sets (average differences of 59.7 degrees below 24 Å). This method represents an efficient way to determine phases for icosahedral viruses, and has the advantage of producing phases free from model bias. It is expected that the method can be extended to other protein systems with high NCS.
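The average phase difference quoted above compares two phase sets reflection by reflection. As a minimal sketch of one way such a figure can be computed, the Python below takes the unweighted mean of the absolute phase differences, wrapping each difference into the range 0 to 180 degrees. The abstract does not say whether the authors weighted the average (for example by amplitude) or how they aligned origins, so the unweighted form, the function names, and the sample values are all assumptions made for illustration.

```python
def wrapped_phase_difference(phi_a, phi_b):
    """Smallest absolute angle between two phases, in degrees (0 to 180)."""
    delta = abs(phi_a - phi_b) % 360.0
    return min(delta, 360.0 - delta)


def mean_phase_difference(phases_a, phases_b):
    """Unweighted mean of wrapped phase differences over matched reflections.

    Assumption: both lists hold phases in degrees for the same reflections,
    in the same order, on a common origin. No amplitude weighting is applied.
    """
    if len(phases_a) != len(phases_b):
        raise ValueError("phase sets must cover the same reflections")
    if not phases_a:
        raise ValueError("phase sets must not be empty")
    total = sum(wrapped_phase_difference(a, b)
                for a, b in zip(phases_a, phases_b))
    return total / len(phases_a)


# Hypothetical phases (degrees) for four low-resolution reflections.
set_a = [10.0, 350.0, 180.0, 90.0]
set_b = [350.0, 10.0, 0.0, 100.0]
print(mean_phase_difference(set_a, set_b))  # differences 20, 20, 180, 10 -> 57.5
```

For phases with no relationship to one another, this measure averages about 90 degrees for acentric reflections, which is the usual baseline against which a figure such as 59.7 degrees is read.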