Keller Andrew, Eng Jimmy, Zhang Ning, Li Xiao-jun, Aebersold Ruedi
Institute for Systems Biology, Seattle, WA, USA.
Mol Syst Biol. 2005;1:2005.0017. doi: 10.1038/msb4100024. Epub 2005 Aug 2.
The analysis of tandem mass spectrometry (MS/MS) data to identify and quantify proteins is hampered by the heterogeneity of file formats at the raw spectral data, peptide identification, and protein identification levels. Different mass spectrometers output their raw spectral data in a variety of proprietary formats, and the alternative methods that assign peptides to MS/MS spectra and infer protein identifications from those peptide assignments each write their results in a different format. Here we describe an MS/MS analysis platform, the Trans-Proteomic Pipeline, which uses open XML file formats to store data at the raw spectral data, peptide, and protein levels. This platform enables uniform analysis and exchange of MS/MS data generated on a variety of instruments and assigned to peptides by a variety of database search programs. We demonstrate this by applying the pipeline to data sets generated by ThermoFinnigan LCQ, ABI 4700 MALDI-TOF/TOF, and Waters Q-TOF instruments, and searched in turn using SEQUEST, Mascot, and COMET.