Lange Eva, Gröpl Clemens, Schulz-Trieglaff Ole, Leinenbach Andreas, Huber Christian, Reinert Knut
Free University Berlin, Department of Mathematics and Computer Science, Berlin, Germany.
Bioinformatics. 2007 Jul 1;23(13):i273-81. doi: 10.1093/bioinformatics/btm209.
Liquid chromatography coupled to mass spectrometry (LC-MS) and to tandem mass spectrometry (LC-MS/MS) have become prominent tools for the analysis of complex proteomic samples. An important step in a typical workflow is the combination of results from multiple LC-MS experiments, either to improve confidence in the obtained measurements or to compare results from different samples. To do so, a suitable mapping or alignment between the data sets needs to be estimated. The alignment has to correct for the variations in mass and elution time that are present in all mass spectrometry experiments.
We propose a novel algorithm to align LC-MS samples and to match corresponding ion species across samples. Our algorithm matches landmark signals between two data sets using a geometric technique based on pose clustering. Variations in mass and retention time are corrected by an affine dewarping function estimated from the matched landmarks. We use the pairwise dewarping in an algorithm for aligning multiple samples. We show that our pose clustering approach is fast and reliable compared with previous approaches. It is robust in the presence of noise and able to accurately align samples with only a few common ion species. In addition, we can easily handle different kinds of LC-MS data and adapt our algorithm to new mass spectrometry technologies.
This algorithm is implemented as part of the OpenMS software library for shotgun proteomics and is available under the GNU Lesser General Public License (LGPL) at www.openms.de.
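The pose clustering idea summarized above can be illustrated with a minimal sketch. It is not the OpenMS implementation: the function name `estimate_affine_rt`, its parameters, and the bin sizes are hypothetical, and the sketch makes the simplifying assumption that only retention time is dewarped while m/z serves as a matching constraint. Every pair of landmarks in one map is compared with every m/z-compatible pair in the reference map, each such comparison votes for one affine transformation (scale, shift), and the most frequently voted transformation is returned.

```python
from collections import Counter
from itertools import combinations


def estimate_affine_rt(ref, other, mz_tol=0.01, scale_bin=0.01,
                       shift_bin=1.0, scale_range=(0.5, 2.0)):
    """Estimate (a, b) such that rt_ref is approximately a * rt_other + b.

    ref, other: lists of (rt, mz) landmark tuples (illustrative format).
    Returns None if no pair of landmarks produces a vote.
    """
    votes = Counter()
    for p1, p2 in combinations(other, 2):
        d_other = p2[0] - p1[0]
        if d_other == 0:
            continue  # two landmarks at the same RT define no scale
        for q1 in ref:
            if abs(q1[1] - p1[1]) > mz_tol:
                continue  # q1 cannot correspond to p1
            for q2 in ref:
                if q2 is q1 or abs(q2[1] - p2[1]) > mz_tol:
                    continue  # q2 cannot correspond to p2
                a = (q2[0] - q1[0]) / d_other
                if not scale_range[0] <= a <= scale_range[1]:
                    continue  # discard implausible scalings
                b = q1[0] - a * p1[0]
                # Vote in a coarse (scale, shift) grid.
                votes[(round(a / scale_bin), round(b / shift_bin))] += 1
    if not votes:
        return None
    (ka, kb), _ = votes.most_common(1)[0]
    return ka * scale_bin, kb * shift_bin


def dewarp_rt(landmarks, a, b):
    """Apply the estimated affine map to the retention times."""
    return [(a * rt + b, mz) for rt, mz in landmarks]
```

Because spurious landmark pairs scatter their votes across many grid cells while true correspondences agree on one cell, the winning cell tends to remain stable when noise signals are present. A production implementation would typically refine the winning transformation, for example by a regression over the landmarks that voted for it, rather than returning the bin center as done here.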