Wang Lincong, Donald Bruce Randall
Dartmouth Computer Science Department, Hanover, NH 03755, USA.
Proc IEEE Comput Syst Bioinform Conf. 2005:189-202. doi: 10.1109/csb.2005.13.
Nuclear Overhauser effect (NOE) distance restraints are the main experimental data from protein nuclear magnetic resonance (NMR) spectroscopy for computing a complete three-dimensional solution structure, including sidechain conformations. In general, NOE restraints must be assigned before they can be used in a structure determination program. NOE assignment is very time-consuming to do manually and challenging to fully automate, and it has become a key bottleneck for high-throughput NMR structure determination. The difficulty in automated NOE assignment is ambiguity: based solely on its chemical shifts, an NOE peak can have tens of possible assignments. Previous automated NOE assignment approaches rely on an ensemble of structures, computed from a subset of all the NOEs, to iteratively filter ambiguous assignments. These algorithms are heuristic in nature, provide no guarantees on solution quality or running time, and are slow in practice. In this paper we present an accurate, efficient NOE assignment algorithm. It first invokes the algorithm of [30, 29] to compute an accurate backbone structure using only two backbone residual dipolar couplings (RDCs) per residue. It then filters ambiguous NOE assignments by merging an ensemble of intra-residue vectors from a protein rotamer database with internuclear vectors from the computed backbone structure. The protein rotamer database was built from ultra-high resolution structures (<1.0 Å) in the Protein Data Bank (PDB). The algorithm has been successfully applied to assign more than 1,700 NOE distance restraints with better than 90% accuracy on the protein human ubiquitin, using real, experimentally recorded NMR data. It assigns these NOE restraints in less than one second on a single-processor workstation.
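The filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data layout, the function names (`min_distance`, `filter_assignments`), the choice of one anchor atom per residue, and the 6.0 Å cutoff are all assumptions made for illustration. The idea shown is only that a candidate assignment is kept when at least one combination of rotamer-derived intra-residue vectors, placed on the backbone, brings the two protons within the NOE distance bound.

```python
import math
from itertools import product

def min_distance(anchor_a, vectors_a, anchor_b, vectors_b):
    """Smallest proton-proton distance over all pairs of rotamer vectors.

    anchor_*  : backbone coordinate of the residue (hypothetical: one atom per residue)
    vectors_* : intra-residue vectors from that anchor to the proton, one per rotamer,
                assumed to be already expressed in the backbone's coordinate frame
    """
    best = math.inf
    for va, vb in product(vectors_a, vectors_b):
        pa = tuple(c + v for c, v in zip(anchor_a, va))
        pb = tuple(c + v for c, v in zip(anchor_b, vb))
        best = min(best, math.dist(pa, pb))
    return best

def filter_assignments(candidates, anchors, rotamer_vectors, upper_bound):
    """Keep candidate assignments that some rotamer combination can satisfy."""
    kept = []
    for proton_a, proton_b in candidates:
        d = min_distance(
            anchors[proton_a[0]], rotamer_vectors[proton_a],
            anchors[proton_b[0]], rotamer_vectors[proton_b],
        )
        if d <= upper_bound:
            kept.append((proton_a, proton_b))
    return kept

# Toy data (invented for illustration, not taken from the paper).
# Protons are keyed as (residue number, atom name).
anchors = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 10: (20.0, 0.0, 0.0)}
rotamer_vectors = {
    (1, "HB"): [(0.0, 1.5, 0.0), (0.0, -1.5, 0.0)],   # two rotamers
    (2, "HN"): [(-1.0, 0.5, 0.0)],                    # backbone proton: one vector
    (10, "HG"): [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
}
# Two assignments that chemical shifts alone cannot tell apart.
candidates = [((1, "HB"), (2, "HN")), ((1, "HB"), (10, "HG"))]

kept = filter_assignments(candidates, anchors, rotamer_vectors, upper_bound=6.0)
print(kept)
```

In this toy case only the pairing of residue 1 with residue 2 survives, because no rotamer combination places the protons of residues 1 and 10 within the assumed bound. The cost is one distance evaluation per rotamer pair per candidate, with no iterative structure calculation.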