Wan Xiang, Lin Guohui
Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada.
Comput Syst Bioinformatics Conf. 2006:55-66.
Successful backbone resonance sequential assignment is fundamental to protein three-dimensional structure determination via NMR spectroscopy. Sequential assignment can roughly be partitioned into three steps: grouping resonance peaks from multiple spectra into spin systems, chaining the resulting spin systems into strings, and assigning strings of spin systems to non-overlapping segments of consecutive amino acid residues in the target protein. Many existing assignment programs handle these three steps separately. This works well on protein NMR data of close to ideal quality, but only moderately or even poorly on most real protein datasets, where noise and data degeneracy occur frequently. In this work we propose to partition the sequential assignment into virtual steps rather than physical ones, and to use the outputs of these steps to cross-validate each other. The novelty is that ambiguities in the grouping step are resolved by finding highly confident strings in the chaining step, and ambiguities in the chaining step are resolved by examining the mappings of strings in the assignment step. In this way, all ambiguities in the sequential assignment are resolved globally and optimally. The resulting assignment program, GASA, was compared with several recent similar developments: RIBRA, MARS, PACES, and a random graph approach. The performance comparisons suggest that GASA may be more promising for practical use.
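To make the chaining step concrete, the following is a minimal sketch of how spin systems might be linked into strings, assuming each spin system carries intra-residue and preceding-residue C-alpha/C-beta chemical shifts. It is not the GASA algorithm: it shows only a conventional, stand-alone chaining step that keeps unambiguous links, which is exactly the kind of separate step the abstract argues is fragile on noisy data. The `SpinSystem` fields, the function names, and the 0.2 ppm tolerance are all illustrative assumptions rather than details taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpinSystem:
    """Hypothetical spin system; field names are assumptions for illustration."""
    name: str
    ca: float       # intra-residue C-alpha shift (ppm)
    cb: float       # intra-residue C-beta shift (ppm)
    ca_prev: float  # C-alpha shift observed for the preceding residue (ppm)
    cb_prev: float  # C-beta shift observed for the preceding residue (ppm)


def can_precede(a, b, tol=0.2):
    """True if a's own shifts match b's preceding-residue shifts within tol (assumed value)."""
    return abs(a.ca - b.ca_prev) <= tol and abs(a.cb - b.cb_prev) <= tol


def unambiguous_strings(systems, tol=0.2):
    """Chain spin systems, keeping only links with a unique successor and a unique predecessor."""
    succ = {
        a.name: [b.name for b in systems if b is not a and can_precede(a, b, tol)]
        for a in systems
    }
    pred_count = {s.name: 0 for s in systems}
    for targets in succ.values():
        for t in targets:
            pred_count[t] += 1

    # Keep a link a -> b only if it is the sole candidate on both sides.
    nxt = {
        a: ts[0]
        for a, ts in succ.items()
        if len(ts) == 1 and pred_count[ts[0]] == 1
    }
    has_pred = set(nxt.values())

    strings = []
    for s in systems:
        if s.name in has_pred:
            continue  # not the start of a string
        chain = [s.name]
        while chain[-1] in nxt:
            chain.append(nxt[chain[-1]])
        strings.append(chain)
    # Spin systems that form a closed cycle of kept links have no start
    # and are omitted by this simple sketch.
    return strings


if __name__ == "__main__":
    demo = [
        SpinSystem("A", 55.0, 30.0, 60.0, 40.0),
        SpinSystem("B", 62.0, 33.0, 55.1, 30.1),
        SpinSystem("C", 52.0, 19.0, 62.1, 32.9),
        SpinSystem("D", 50.0, 25.0, 70.0, 70.0),
    ]
    print(unambiguous_strings(demo))  # [['A', 'B', 'C'], ['D']]
```

If a fifth spin system also matched the shifts of `A`, the link out of `A` would be dropped as ambiguous and the string would break there. A stand-alone chaining step has no way to recover from this, whereas the approach described above would defer the decision to the assignment step.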