Department of Computer Science and Engineering, University of California, San Diego, California 92093, USA.
Mol Cell Proteomics. 2010 Dec;9(12):2772-82. doi: 10.1074/mcp.M110.002766. Epub 2010 Sep 20.
Top-down proteomics studies intact proteins, enabling new opportunities for analyzing post-translational modifications. Because tandem mass spectra of intact proteins are very complex, spectral deconvolution (grouping peaks into isotopomer envelopes) is a key initial stage in their interpretation. In such spectra, isotopomer envelopes of different protein fragments span overlapping regions on the m/z axis and even share spectral peaks. This raises both pattern recognition and combinatorial challenges for spectral deconvolution. We present MS-Deconv, a combinatorial algorithm for spectral deconvolution. The algorithm first generates a large set of candidate isotopomer envelopes for a spectrum, then represents the spectrum as a graph, and finally selects the highest scoring subset of envelopes as a heaviest path in the graph. In contrast with other approaches, the algorithm scores sets of envelopes rather than individual envelopes. We demonstrate that MS-Deconv improves on Thrash and Xtract in both the number of correctly recovered monoisotopic masses and speed. We applied MS-Deconv to a large set of top-down spectra from Yersinia rohdei (whose genome is still unsequenced) and further matched them against the protein database of the related, sequenced bacterium Yersinia enterocolitica. MS-Deconv is available at http://proteomics.ucsd.edu/Software.html.
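The idea of selecting a highest scoring set of envelopes as a heaviest path can be illustrated with a deliberately simplified sketch. It is not the MS-Deconv implementation: it assumes each candidate envelope is reduced to an m/z interval with a precomputed score, and it forbids selected envelopes from overlapping, whereas the abstract notes that real envelopes may overlap and share peaks. Under those assumptions the problem becomes weighted interval scheduling, which is a heaviest-path computation in a directed acyclic graph. The names `Envelope` and `heaviest_envelope_path` are hypothetical.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Envelope(NamedTuple):
    """Hypothetical candidate envelope, reduced to an m/z interval."""
    start_mz: float  # lowest m/z covered by the envelope
    end_mz: float    # highest m/z covered by the envelope
    score: float     # assumed precomputed quality score


def heaviest_envelope_path(envelopes: List[Envelope]) -> List[Envelope]:
    """Return the non-overlapping envelopes with the highest total score.

    Simplification: envelopes that merely touch (end == start) are
    treated as compatible; any other overlap is disallowed.
    """
    ordered = sorted(envelopes, key=lambda e: e.end_mz)
    ends = [e.end_mz for e in ordered]
    n = len(ordered)

    best = [0.0] * (n + 1)    # best[i]: top score using the first i envelopes
    take = [False] * (n + 1)  # take[i]: whether envelope i is on the best path
    prev = [0] * (n + 1)      # prev[i]: prefix length compatible with envelope i

    for i, env in enumerate(ordered, start=1):
        # Count earlier envelopes that end at or before this one starts.
        j = bisect_right(ends, env.start_mz, 0, i - 1)
        prev[i] = j
        with_env = best[j] + env.score
        if with_env > best[i - 1]:
            best[i] = with_env
            take[i] = True
        else:
            best[i] = best[i - 1]

    # Walk back along the heaviest path to recover the chosen envelopes.
    chosen = []
    i = n
    while i > 0:
        if take[i]:
            chosen.append(ordered[i - 1])
            i = prev[i]
        else:
            i -= 1
    chosen.reverse()
    return chosen
```

Because the total score of a set is maximized, a strong envelope can be dropped when two weaker compatible envelopes together score higher, which mirrors the abstract's point about scoring sets of envelopes rather than individual ones.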