Yan Bo, Pan Chongle, Olman Victor N, Hettich Robert L, Xu Ying
Computational Systems Biology Laboratory, Department of Biochemical and Molecular Biology, University of Georgia, Athens, GA 30602, USA.
Bioinformatics. 2005 Mar 1;21(5):563-74. doi: 10.1093/bioinformatics/bti044. Epub 2004 Sep 28.
Ion-type identification is a fundamental problem in computational proteomics. Methods for accurate identification of ion types provide the basis for many mass spectrometry data interpretation problems, including (a) de novo sequencing, (b) identification of post-translational modifications and mutations and (c) validation of database search results.
Here, we present a novel graph-theoretic approach to separating b ions from y ions in a set of tandem mass spectra. We represent each spectral peak as a node and consider two types of edges: type-1 edges, which connect two peaks that are probably of the same ion type, and type-2 edges, which connect two peaks that are probably of different ion types. The ion-separation problem is formulated and solved as a graph partition problem: partition the graph into three subgraphs, representing b ions, y ions and other ions, respectively, by maximizing the total weight of type-1 edges while minimizing the total weight of type-2 edges within each partitioned subgraph. We have developed a dynamic programming algorithm that rigorously solves this graph partition problem and implemented it as a computer program, PRIME (PaRtition of Ion types in tandem Mass spEctra). Tests on a large number of simulated mass spectra and 19 sets of high-quality experimental Fourier transform ion cyclotron resonance tandem mass spectra indicate an accuracy of approximately 90% for the separation of b and y ions.
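The partition objective described above can be illustrated with a minimal sketch. This is not PRIME and not the dynamic programming algorithm of the paper: it is an exhaustive search over all labelings of a toy graph, and the function names, edge weights and peak indices are hypothetical, chosen only to show how type-1 and type-2 edges enter the score.

```python
from itertools import product

# The three subgraphs named in the abstract.
LABELS = ("b", "y", "other")


def partition_score(labels, type1, type2):
    """Score a labeling: reward type-1 edges and penalize type-2 edges
    whose two endpoints fall inside the same subgraph."""
    score = 0.0
    for (u, v), w in type1.items():
        if labels[u] == labels[v]:
            score += w  # same-type evidence kept within one subgraph
    for (u, v), w in type2.items():
        if labels[u] == labels[v]:
            score -= w  # different-type evidence violated
    return score


def best_partition(n_peaks, type1, type2):
    """Exhaustive search over 3**n_peaks labelings (toy sizes only;
    an assumption for illustration, not the paper's algorithm)."""
    best_labels, best_score = None, float("-inf")
    for labels in product(LABELS, repeat=n_peaks):
        s = partition_score(labels, type1, type2)
        if s > best_score:
            best_labels, best_score = labels, s
    return best_labels, best_score


# Hypothetical toy graph: four peaks, edges keyed by (peak, peak) index pairs.
type1 = {(0, 1): 1.0, (2, 3): 1.0}  # peaks probably of the same ion type
type2 = {(0, 3): 2.0, (1, 2): 2.0}  # peaks probably of different ion types

labels, score = best_partition(4, type1, type2)
print(labels, score)
```

In this sketch the objective is symmetric in the three labels, so the search only recovers the grouping of peaks (here peaks 0 and 1 together, peaks 2 and 3 together). Deciding which group is b and which is y would need additional information that this toy example does not model.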
The executable code of PRIME is available upon request.