Eghbalnia Hamid R, Bahrami Arash, Wang Liya, Assadi Amir, Markley John L
Biochemistry Department, National Magnetic Resonance Facility at Madison, 433 Babcock Drive, Madison, WI 53706, USA.
J Biomol NMR. 2005 Jul;32(3):219-33. doi: 10.1007/s10858-005-7944-6.
We present a novel automated strategy (PISTACHIO) for the probabilistic assignment of backbone and sidechain chemical shifts in proteins. The algorithm uses peak lists derived from various NMR experiments as input and provides as output ranked lists of assignments for all signals recognized in the input data as constituting spin systems. PISTACHIO was evaluated on raw peak-picked data from 15 proteins ranging from 54 to 300 residues, and its results were compared with those achieved by experts analyzing the same datasets by hand. As scored against the best available independent assignments for these proteins, the first-ranked PISTACHIO assignments were 80-100% correct for backbone signals and 75-95% correct for sidechain signals. The independent assignments benefited, in a number of cases, from structural data (e.g., from NOESY spectra) that were unavailable to PISTACHIO. Any number of datasets in any combination can serve as input. Thus, PISTACHIO can be used as datasets are collected to ascertain the current extent of secure assignments, to identify residues with low assignment probability, and to suggest the types of additional data needed to remove ambiguities. The current implementation of PISTACHIO, which is available from a server on the Internet, supports input data from 15 standard double- and triple-resonance experiments. The software can readily accommodate additional types of experiments, including data from selectively labeled samples. The assignment probabilities can be carried forward and refined in subsequent steps leading to a structure. The performance of PISTACHIO showed no direct dependence on protein size, but correlated instead with data quality (completeness and signal-to-noise). PISTACHIO represents one component of a comprehensive probabilistic approach we are developing for the collection and analysis of protein NMR data.
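The abstract describes two uses of the ranked, probabilistic output: scoring first-ranked assignments against independent reference assignments, and identifying residues with low assignment probability. The following is a minimal Python sketch of both ideas. It is not PISTACHIO's code or output format: the `Candidate` type, the function names, the 0.8 threshold, and the toy data are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate assignment for a spin system (hypothetical structure)."""
    residue: str        # residue label, e.g. "G10"
    probability: float  # assignment probability in [0, 1]


def first_ranked_accuracy(ranked, reference):
    """Fraction of spin systems whose top-ranked candidate matches the reference.

    `ranked` maps a spin-system id to its candidates, best first.
    `reference` maps a spin-system id to the independently assigned residue.
    Only spin systems present in both, with at least one candidate, are scored.
    """
    scored = [sid for sid in ranked if sid in reference and ranked[sid]]
    if not scored:
        return 0.0
    correct = sum(1 for sid in scored if ranked[sid][0].residue == reference[sid])
    return correct / len(scored)


def low_confidence(ranked, threshold=0.8):
    """Ids of spin systems whose best candidate falls below `threshold`.

    The threshold value is an arbitrary illustrative choice.
    """
    return sorted(
        sid for sid, cands in ranked.items()
        if not cands or cands[0].probability < threshold
    )


# Toy data, invented for illustration only (not results from the paper).
ranked = {
    "ss1": [Candidate("G10", 0.95), Candidate("G42", 0.05)],
    "ss2": [Candidate("A11", 0.55), Candidate("A30", 0.45)],
    "ss3": [Candidate("L12", 0.90), Candidate("L77", 0.10)],
}
reference = {"ss1": "G10", "ss2": "A30", "ss3": "L12"}

accuracy = first_ranked_accuracy(ranked, reference)
flagged = low_confidence(ranked)
print(f"first-ranked accuracy: {accuracy:.2f}")
print(f"low-confidence spin systems: {flagged}")
```

In this toy case the one incorrect first-ranked assignment (`ss2`) is also the one flagged as low confidence, which is the situation in which collecting additional data would be suggested.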