May Damon H, Tamura Kaipo, Noble William S
Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States.
Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States.
J Proteome Res. 2017 Apr 7;16(4):1817-1824. doi: 10.1021/acs.jproteome.7b00028. Epub 2017 Mar 13.
In shotgun proteomics analysis, user-specified parameters are critical to database search performance and therefore to the yield of confident peptide-spectrum matches (PSMs). Two of the most important parameters are related to the accuracy of the mass spectrometer. Precursor mass tolerance defines the peptide candidates considered for each spectrum. Fragment mass tolerance, or bin size, determines how close observed and theoretical fragments must be to be considered a match. For either parameter, too wide a setting yields randomly high-scoring false PSMs, whereas too narrow a setting erroneously excludes true PSMs; both errors lower the yield of peptides detected at a given false discovery rate. We describe a strategy for inferring optimal search parameters: it assembles pairs of spectra that are likely to have been generated by the same peptide ion and analyzes them to estimate precursor and fragment mass error. This strategy does not rely on a database search, making it usable in a wide variety of settings. In our experiments on data from a variety of instruments, including Orbitrap and Q-TOF acquisitions, this strategy yields more high-confidence PSMs than settings based on instrument defaults or determined by experts. Param-Medic, the tool that implements this strategy, is open-source and cross-platform. It is available as a standalone tool (http://noble.gs.washington.edu/proj/param-medic/) and has been integrated into the Crux proteomics toolkit (http://crux.ms), providing automatic parameter selection for the Comet and Tide search engines.
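The pairing idea in the abstract can be illustrated with a minimal Python sketch. It is not Param-Medic's actual implementation: the function names, the dictionary layout of a spectrum, the pairing thresholds, and the use of a median-absolute-deviation estimator are all assumptions chosen for illustration. The sketch pairs spectra whose precursor m/z values are close and whose fragment peaks overlap, then treats the spread of the paired precursor differences as an estimate of precursor mass error.

```python
from math import sqrt
from statistics import median


def ppm_diff(mz_a, mz_b):
    """Signed difference between two m/z values, in parts per million of mz_b."""
    return (mz_a - mz_b) / mz_b * 1e6


def shared_peaks(peaks_a, peaks_b, tol_mz):
    """Count peaks in peaks_a that have a partner in peaks_b within tol_mz."""
    return sum(any(abs(a - b) <= tol_mz for b in peaks_b) for a in peaks_a)


def pair_spectra(spectra, max_precursor_ppm=50.0, frag_tol_mz=0.5, min_shared=3):
    """Pair spectra (in acquisition order) that plausibly share a peptide ion.

    Hypothetical criteria: precursors within max_precursor_ppm of each other
    and at least min_shared fragment peaks in common. The quadratic loop is
    kept for clarity, not speed.
    """
    pairs = []
    for i, a in enumerate(spectra):
        for b in spectra[i + 1:]:
            close = abs(ppm_diff(b["precursor_mz"], a["precursor_mz"])) <= max_precursor_ppm
            if close and shared_peaks(a["peaks"], b["peaks"], frag_tol_mz) >= min_shared:
                pairs.append((a, b))
    return pairs


def robust_sigma(values):
    """Standard deviation estimated from the median absolute deviation."""
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return 1.4826 * mad  # consistency factor for normally distributed data


def estimate_precursor_sigma_ppm(spectra):
    """Estimate per-measurement precursor error (ppm) from paired spectra.

    Assumes the two measurements in a pair have independent errors, so the
    spread of their difference is sqrt(2) times the per-measurement spread.
    """
    pairs = pair_spectra(spectra)
    if not pairs:
        raise ValueError("no spectrum pairs found")
    diffs = [ppm_diff(b["precursor_mz"], a["precursor_mz"]) for a, b in pairs]
    return robust_sigma(diffs) / sqrt(2)
```

Each spectrum here is a plain dictionary with a `precursor_mz` float and a `peaks` list of fragment m/z values. A fragment-error estimate could be built the same way from the m/z differences between matched fragment peaks. How an estimated error should be converted into a search tolerance is deliberately left out, since the abstract does not specify it.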