Romo Tod D, Sacchettini James C, Ioerger Thomas R
Texas A&M Center for Structural Biology, Institute for Biosciences and Technology, Houston, TX 77030, USA.
Acta Crystallogr D Biol Crystallogr. 2006 Nov;62(Pt 11):1401-6. doi: 10.1107/S0907444906034019. Epub 2006 Oct 18.
Automated methods for protein model building in X-ray crystallography typically use a two-phase approach: the protein backbone is modeled first, and the side chains are then built in. The latter phase requires identifying the amino-acid side-chain type as well as fitting the side-chain model into the observed electron density. Mistakes in the identification of individual side chains are common for a number of reasons, but sequence alignment can sometimes correct them by mapping fragments onto the true (expected) amino-acid sequence and exploiting contiguity constraints among neighbors. However, side chains cannot always be confidently aligned; this depends on having sufficient accuracy in the initial calls. The recognition of amino-acid side chains based on the surrounding pattern of electron density, whether by features, density correlation or free atoms, can be sensitive to inaccuracies in the coordinates of the predicted backbone Cα atoms to which they are anchored. By incorporating a Nelder-Mead Simplex search into the side-chain identification and model-building routines of TEXTAL, it is demonstrated that this form of residue-by-residue rigid-body real-space refinement (in which the Cα itself is allowed to shift) can improve the initial accuracy of side-chain selection by over 25% in relative terms on average (from 25% to 32% average identity on a test set of five representative proteins, without corrections by sequence alignment). This improvement in amino-acid selection accuracy in TEXTAL is often sufficient to bring the pairwise amino-acid identity of chains in the model out of the so-called 'twilight zone' for sequence-alignment methods. When coupled with sequence alignment, use of the Simplex search improved side-chain accuracy by over 13 percentage points on average (from 64% to 77%) and by up to 38 percentage points (from 40% to 78%) in one case, compared with using sequence alignment alone.
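The kind of residue-by-residue rigid-body real-space refinement described above can be illustrated with a minimal sketch. This is not TEXTAL's implementation. It assumes a toy density made of isotropic Gaussians in place of a real electron-density map, a score equal to the negative summed density at the atom positions, and a six-parameter pose (three translations, three rotation-vector components) in which the fragment rotates about its Cα and the Cα is free to shift. All function names (`rotate`, `place`, `density`, `score`, `nelder_mead`, `refine_residue`) and all numerical settings are hypothetical choices made for illustration.

```python
import math

def rotate(v, rotvec):
    """Rotate 3-vector v by a rotation vector (axis * angle), Rodrigues' formula."""
    angle = math.sqrt(sum(c * c for c in rotvec))
    if angle < 1e-12:
        return list(v)
    k = [c / angle for c in rotvec]
    kxv = [k[1] * v[2] - k[2] * v[1], k[2] * v[0] - k[0] * v[2], k[0] * v[1] - k[1] * v[0]]
    kdv = sum(a * b for a, b in zip(k, v))
    c, s = math.cos(angle), math.sin(angle)
    return [v[i] * c + kxv[i] * s + k[i] * kdv * (1 - c) for i in range(3)]

def place(atoms, params):
    """Rigid-body move: rotate about the first atom (taken as C-alpha), then translate."""
    t, rotvec = params[:3], params[3:]
    ca = atoms[0]
    placed = []
    for a in atoms:
        r = rotate([a[i] - ca[i] for i in range(3)], rotvec)
        placed.append([ca[i] + r[i] + t[i] for i in range(3)])
    return placed

def density(p, centers, sigma=0.7):
    """Toy density (assumption): sum of isotropic Gaussians at the given centers."""
    return sum(math.exp(-sum((a - b) ** 2 for a, b in zip(p, c)) / (2 * sigma ** 2))
               for c in centers)

def score(params, atoms, centers):
    """Lower is better: negative density summed over the moved atoms."""
    return -sum(density(p, centers) for p in place(atoms, params))

def nelder_mead(f, x0, step=0.3, iters=400):
    """Plain Nelder-Mead simplex search (reflect, expand, contract, shrink)."""
    n = len(x0)
    simplex = [list(x0)] + [[x0[j] + (step if j == i else 0.0) for j in range(n)]
                            for i in range(n)]
    values = [f(p) for p in simplex]
    for _ in range(iters):
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        worst = simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]
        refl = [2.0 * centroid[j] - worst[j] for j in range(n)]
        f_refl = f(refl)
        if f_refl < values[0]:
            # Reflection beat the best vertex: try going further (expansion).
            expd = [3.0 * centroid[j] - 2.0 * worst[j] for j in range(n)]
            f_expd = f(expd)
            if f_expd < f_refl:
                simplex[-1], values[-1] = expd, f_expd
            else:
                simplex[-1], values[-1] = refl, f_refl
        elif f_refl < values[-2]:
            simplex[-1], values[-1] = refl, f_refl
        else:
            # Contract the worst vertex toward the centroid.
            con = [centroid[j] + 0.5 * (worst[j] - centroid[j]) for j in range(n)]
            f_con = f(con)
            if f_con < values[-1]:
                simplex[-1], values[-1] = con, f_con
            else:
                # Shrink every vertex toward the best one.
                best = simplex[0]
                simplex = [best] + [[best[j] + 0.5 * (p[j] - best[j]) for j in range(n)]
                                    for p in simplex[1:]]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    i_best = min(range(n + 1), key=lambda i: values[i])
    return simplex[i_best], values[i_best]

def refine_residue(atoms, centers):
    """Refine one residue's pose; returns (params, start_score, refined_score)."""
    x0 = [0.0] * 6  # identity pose: no translation, no rotation
    f = lambda x: score(x, atoms, centers)
    best, best_score = nelder_mead(f, x0)
    return best, f(x0), best_score

# Hypothetical four-atom fragment; the starting model is displaced from the density peaks.
TRUE_ATOMS = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.2, 1.3, 0.0], [3.7, 1.4, 0.3]]
MODEL = [[x + 0.4, y - 0.3, z + 0.2] for x, y, z in TRUE_ATOMS]
params, start_score, refined_score = refine_residue(MODEL, TRUE_ATOMS)
print("pose parameters:", [round(p, 3) for p in params])
print("score before:", round(start_score, 3), "after:", round(refined_score, 3))
```

A derivative-free search such as Nelder-Mead suits this setting because the score comes from sampling a density map rather than from a closed-form function. In a real pipeline the refined pose would presumably be rescored against each candidate side-chain type before a selection is made; that step is not shown here.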