Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, USA.
Protein Sci. 2010 Mar;19(3):520-34. doi: 10.1002/pro.330.
For naturally occurring proteins, similar sequence implies similar structure. Consequently, multiple sequence alignments (MSAs) are often used in template-based modeling of protein structure and have been incorporated into fragment-based assembly methods. Our previous homology-free structure prediction study introduced an algorithm that mimics the folding pathway by coupling the formation of secondary and tertiary structure. Each move in the Monte Carlo procedure changes only a single pair of phi, psi backbone dihedral angles, drawn from a Protein Data Bank-based distribution appropriate for each amino acid and conditional on the type and conformation of the flanking residues. We improve this method by using MSAs to enrich the sampling distribution, in a manner that does not require structural knowledge of any protein sequence (i.e., it is not homologous fragment insertion). In combination with other tools, including clustering and refinement, the accuracies of the predicted secondary and tertiary structures are substantially improved, and a global and position-resolved measure of confidence in the accuracy of the predictions is introduced. Performance of the method in the Critical Assessment of Structure Prediction (CASP8) is discussed.
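The single-pair move described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `TOY_LIBRARY`, `DEFAULT_PAIRS`, `candidate_pairs`, and `single_pair_move` are hypothetical, the angle values are made up, and pooling candidates over the residue triplets seen at an aligned position is only one assumed way an MSA might enrich the sampling distribution. Conditioning on the conformation of flanking residues and the Monte Carlo acceptance step are both omitted.

```python
import random

# Hypothetical toy library: (left, centre, right) residue types mapped to
# observed (phi, psi) pairs in degrees. A real library would be built from
# PDB statistics; these values are invented for illustration only.
TOY_LIBRARY = {
    ("A", "G", "A"): [(-60.0, -45.0), (80.0, 10.0)],
    ("A", "S", "A"): [(-120.0, 130.0)],
    ("A", "A", "A"): [(-60.0, -45.0)],
}
# Assumed fallback used when none of the triplets is in the library.
DEFAULT_PAIRS = [(-60.0, -45.0), (-120.0, 130.0)]


def candidate_pairs(column_triplets):
    """Pool (phi, psi) pairs over all triplets seen at one aligned position."""
    pooled = []
    for triplet in column_triplets:
        pooled.extend(TOY_LIBRARY.get(triplet, []))
    return pooled or DEFAULT_PAIRS


def single_pair_move(angles, msa_triplets, rng):
    """Return a copy of `angles` with one (phi, psi) pair resampled, plus its index."""
    i = rng.randrange(len(angles))
    new_angles = list(angles)
    new_angles[i] = rng.choice(candidate_pairs(msa_triplets[i]))
    return new_angles, i


# Usage: a three-residue chain in which position 1 has two aligned variants.
rng = random.Random(0)
angles = [(-150.0, 150.0)] * 3
msa_triplets = [
    [("A", "A", "A")],
    [("A", "G", "A"), ("A", "S", "A")],
    [("X", "X", "X")],  # not in the toy library, so the fallback is used
]
proposal, moved = single_pair_move(angles, msa_triplets, rng)
```

In this sketch the alignment only widens the pool of candidate angles at each position, so no structure of any aligned sequence is needed, which is the property the abstract emphasizes.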