Headd Jeffrey J, Immormino Robert M, Keedy Daniel A, Emsley Paul, Richardson David C, Richardson Jane S
Department of Biochemistry, Duke University Medical Center, 211 Nanaline Duke Building, 3711 DUMC, Durham, NC 27710, USA.
J Struct Funct Genomics. 2009 Mar;10(1):83-93. doi: 10.1007/s10969-008-9045-8. Epub 2008 Nov 11.
Misfit sidechains in protein crystal structures are a stumbling block in using those structures to direct further scientific inference. Problems due to surface disorder and poor electron density are very difficult to address, but a large class of systematic errors is quite common even in well-ordered regions, resulting in sidechains fit backwards into local density in predictable ways. The MolProbity web site is effective at diagnosing such errors, and can perform reliable automated correction of a few special cases, such as 180-degree flips of Asn or Gln sidechain amides, using all-atom contacts and H-bond networks. However, most at-risk residues involve tetrahedral geometry, and their valid correction requires rigorous evaluation of sidechain movement and sometimes backbone shift. The current work extends the benefits of robust automated correction to more sidechain types. The Autofix method identifies candidate systematic, flipped-over errors in Leu, Thr, Val, and Arg using MolProbity quality statistics, proposes a corrected position using real-space refinement with rotamer selection in Coot, and accepts or rejects the correction based on improvement in MolProbity criteria and on chi angle change. Criteria are chosen conservatively, after examining many individual results, to ensure valid correction. To test this method, Autofix was run on 945 representative PDB files and on the 50S ribosomal subunit of file 1YHQ, and the results were analyzed. Over 40% of Leu, Val, and Thr outliers and 15% of Arg outliers were successfully corrected, resulting in a total of 3,679 corrected sidechains, or 4 per structure on average.
Summary Sentences: A common class of misfit sidechains in protein crystal structures is due to systematic errors that place the sidechain backwards into the local electron density. A fully automated method called "Autofix" identifies such errors for Leu, Val, Thr, and Arg and corrects over one third of them, using MolProbity validation criteria and Coot real-space refinement of rotamers.
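The identify, propose, and accept-or-reject flow described in the abstract can be sketched roughly as follows. This is an illustrative outline only, not the Autofix implementation: the `SidechainScore` fields, the candidate test, the rule that a correction must involve a large chi change, and the 60-degree `min_chi_change` threshold are all assumptions introduced here. The actual MolProbity criteria and the Coot real-space refinement step are not reproduced.

```python
from dataclasses import dataclass

# Residue types that the abstract says Autofix handles.
TARGET_RESIDUES = {"LEU", "THR", "VAL", "ARG"}


@dataclass
class SidechainScore:
    """Hypothetical stand-in for per-residue validation results."""
    clash_count: int        # assumed: number of serious all-atom clashes
    rotamer_outlier: bool   # assumed: flagged as a rotamer outlier
    chi_angles: tuple       # sidechain chi angles in degrees


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def is_candidate(resname, score):
    """Assumed candidate test: a target residue with any validation problem."""
    return resname in TARGET_RESIDUES and (
        score.rotamer_outlier or score.clash_count > 0
    )


def accept_correction(before, after, min_chi_change=60.0):
    """Accept only if validation improves and some chi angle moves substantially.

    Both conditions and the threshold are illustrative assumptions.
    """
    no_worse = after.clash_count <= before.clash_count and not after.rotamer_outlier
    better = after.clash_count < before.clash_count or before.rotamer_outlier
    max_change = max(
        angle_diff(x, y) for x, y in zip(before.chi_angles, after.chi_angles)
    )
    return no_worse and better and max_change >= min_chi_change
```

In this sketch, a proposed position that only nudges the chi angles is rejected even if the scores improve, reflecting the conservative intent stated in the abstract: the goal is to fix flipped-over sidechains, not to accept every small refinement.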