Institute of Bioinformatics, Department of Informatics, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 München, Germany.
Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81337 München, Germany.
J Proteome Res. 2019 Apr 5;18(4):1553-1566. doi: 10.1021/acs.jproteome.8b00819. Epub 2019 Mar 8.
Spectral libraries play a central role in the analysis of data-independent-acquisition (DIA) proteomics experiments. A main assumption in current spectral library tools is that a single characteristic intensity pattern (CIP) suffices to describe the fragmentation of a peptide in a particular charge state (peptide charge pair). However, we find that this is often not the case. We carry out a systematic evaluation of spectral variability over public repositories and in-house data sets. We show that spectral variability is widespread and occurs in part even under fixed experimental conditions. Using clustering of preprocessed spectra, we derive a limited number of multiple characteristic intensity patterns (MCIPs) for each peptide charge pair, which allow almost complete coverage of our heterogeneous data set without affecting the false discovery rate. We show that an MCIP library derived from public repositories performs in most cases similarly to a "custom-made" spectral library acquired under the same experimental conditions as the query spectra. We apply the MCIP approach to a DIA data set and observe a significant increase in peptide recognition. We propose the MCIP approach as an easy-to-implement addition to current spectral library search engines and as a new way to utilize the data stored in spectral repositories.
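The idea of keeping several intensity patterns per peptide charge pair can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the preprocessing, or the similarity measure, so the greedy centroid clustering, the cosine similarity, the threshold `min_similarity`, the function names, and the toy intensity vectors below are all assumptions made for illustration.

```python
import math


def normalize(intensities):
    """Scale a fragment-intensity vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in intensities))
    if norm == 0:
        raise ValueError("spectrum has no signal")
    return [x / norm for x in intensities]


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def derive_patterns(spectra, min_similarity=0.9):
    """Greedily cluster the spectra of one peptide charge pair.

    A spectrum joins the first cluster whose centroid it matches with at
    least min_similarity (an assumed threshold); otherwise it starts a
    new cluster. Returns one unit-length centroid per cluster.
    """
    clusters = []   # normalized member spectra of each cluster
    centroids = []  # current unit-length centroid of each cluster
    for raw in spectra:
        spec = normalize(raw)
        for i, centroid in enumerate(centroids):
            if cosine(spec, centroid) >= min_similarity:
                clusters[i].append(spec)
                mean = [sum(col) / len(clusters[i]) for col in zip(*clusters[i])]
                centroids[i] = normalize(mean)
                break
        else:
            clusters.append([spec])
            centroids.append(spec)
    return centroids


def best_match(query, patterns):
    """Highest cosine similarity between a query and any stored pattern."""
    q = normalize(query)
    return max(cosine(q, p) for p in patterns)


# Toy data (hypothetical): four replicate spectra of one peptide charge
# pair, each given as intensities of the same four fragment ions.
spectra = [
    [10, 1, 0, 5],
    [9, 1, 0, 6],
    [0, 8, 7, 1],
    [1, 9, 6, 0],
]

multiple = derive_patterns(spectra)                    # several patterns
single = derive_patterns(spectra, min_similarity=0.0)  # forced single pattern

query = [0, 8, 7, 1]
print(len(multiple), best_match(query, multiple), best_match(query, single))
```

In this toy input the replicates fall into two distinct fragmentation patterns, so a query resembling the second pattern scores much higher against the multi-pattern set than against a single averaged pattern. That is the kind of coverage gap the abstract attributes to single-pattern libraries.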