Bricogne G
MRC Laboratory of Molecular Biology, Cambridge, England.
Acta Crystallogr D Biol Crystallogr. 1993 Jan 1;49(Pt 1):37-60. doi: 10.1107/S0907444992010400.
A new multisolution phasing method based on entropy maximization and likelihood ranking, proposed for the specific purpose of extending probabilistic direct methods to the field of macromolecules, has been implemented in two different computer programs and applied to a wide variety of problems. The latter comprise the determination of small crystal structures from X-ray diffraction data obtained from single crystals or from powders, and from electron diffraction data partially phased by image processing of electron micrographs; the ab initio generation and ranking of phase sets for small proteins; and the improvement of poor quality phases for a larger protein at medium resolution under constraint of solvent flatness. These applications show that the primary goal of this new method, namely increasing the accuracy and sensitivity of probabilistic phase indications compared with conventional direct methods, has been achieved. The main components of the method are (1) a tree-directed search through a space of trial phase sets; (2) the saddle-point method for calculating joint probabilities of structure factors, using entropy maximization; (3) likelihood-based scores to rank trial phase sets and prune the search tree; (4) efficient schemes, based on error-correcting codes, for sampling trial phase sets; (5) a statistical analysis of the scores for automatically selecting reliable phase indications. These components have been implemented to varying degrees of completeness in a computer program (BUSTER) and tested on two small structures as well as on the small protein crambin. The main obstructions to successful ab initio phasing in the latter case seem to reside in the accumulation of phase sampling errors and in the lack of a properly defined molecular envelope, both of which can be remedied within the methods proposed.
A review of the Bayesian statistical theory encompassing all phasing procedures, proposed earlier as an extension of the initial theory, shows that the techniques now available in BUSTER bring closer a number of major enhancements of standard macromolecular phasing techniques, namely isomorphous replacement, molecular replacement, solvent flattening and non-crystallographic symmetry averaging. The gradual implementation of the successive stages of this 'Bayesian programme' should lead to an increasingly integrated, effective and dependable phasing procedure for macromolecular structure determination.
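As an illustration of components (1) and (3) only, the following is a minimal Python sketch of a tree-directed search in which trial phase sets are extended one reflection at a time, ranked by a score, and pruned. It is not taken from BUSTER: the names `toy_score` and `tree_search`, the quadrant phase choices, and the hidden `TARGET` phases are all hypothetical, and the score is a simple stand-in for the likelihood-based scores that the method derives from entropy maximization.

```python
import math

# Hypothetical "true" phases (degrees) for four reflections; used only by the toy score.
TARGET = (45, 135, 225, 315)
# Assumed quadrant sampling of each unknown phase.
CHOICES = (45, 135, 225, 315)

def toy_score(phases):
    """Stand-in for a likelihood-based score (higher is better).

    Sums the cosine of the phase error over the reflections assigned so far,
    so partial phase sets can be scored as well as complete ones.
    """
    return sum(math.cos(math.radians(p - t)) for p, t in zip(phases, TARGET))

def tree_search(n_reflections, choices, score, keep):
    """Grow the tree one reflection per level, keeping the `keep` best nodes."""
    nodes = [()]  # root node: no phases assigned yet
    for _ in range(n_reflections):
        # Expand every surviving node by each possible phase choice.
        children = [node + (c,) for node in nodes for c in choices]
        # Rank the children and prune all but the top `keep`.
        children.sort(key=score, reverse=True)
        nodes = children[:keep]
    return nodes

survivors = tree_search(len(TARGET), CHOICES, toy_score, keep=2)
best = survivors[0]
```

With `keep=2`, at most 8 trial phase sets are scored per level instead of the 256 complete combinations, which is the point of pruning. In the actual method the ranking would rely on the likelihood scores and the statistical analysis of component (5) rather than on a known target.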