Akhtar Malik N, Southey Bruce R, Andrén Per E, Sweedler Jonathan V, Rodriguez-Zas Sandra L
Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America.
Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
PLoS One. 2014 Oct 17;9(10):e111112. doi: 10.1371/journal.pone.0111112. eCollection 2014.
To support accurate neuropeptide identification in mass spectrometry experiments, a novel Monte Carlo permutation testing approach was used to compute significance values. Testing was based on k-permuted decoy databases, where k denotes the number of permutations. These databases were integrated with a range of peptide identification indicators from three popular open-source database search programs (OMSSA, Crux, and X! Tandem) to assess the statistical significance of neuropeptide spectra matches. Significance p-values were computed as the fraction of the sequences in the database with a match indicator value better than or equal to that of the true target spectra. When applied to a test-bed of all known manually annotated mouse neuropeptides, permutation tests with k-permuted decoy databases identified up to 100% of the neuropeptides at p-value < 10^-5. The permutation test p-values using the hyperscore (X! Tandem), E-value (OMSSA) and Sp score (Crux) match indicators outperformed all other match indicators. The robust peptide-detection performance of the intuitive indicator "number of matched ions between the experimental and theoretical spectra" highlights the importance of considering this indicator when the p-value is borderline significant. Our findings suggest that permutation decoy databases of size 1×10^5 are adequate to accurately detect neuropeptides, and this can be exploited to increase the speed of the search. The straightforward Monte Carlo permutation testing (comparable to a zero-order Markov model) can be easily combined with existing peptide identification software to enable accurate and effective neuropeptide detection. The source code is available at http://stagbeetle.animal.uiuc.edu/pepshop/MSMSpermutationtesting.
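The p-value definition in the abstract (the fraction of decoy sequences whose match indicator is better than or equal to the target's) can be illustrated with a minimal Python sketch. This is not the authors' released code: the function names `permuted_decoys` and `permutation_p_value` are hypothetical, the scores are assumed to come from a separate search engine run, and the sketch assumes that the target itself is not counted among the decoys, which the abstract does not specify.

```python
import random


def permuted_decoys(sequence, k, seed=0):
    """Return k random permutations of the residues in `sequence`.

    Hypothetical helper: each decoy keeps the amino acid composition of
    the target peptide but shuffles the residue order.
    """
    rng = random.Random(seed)
    residues = list(sequence)
    decoys = []
    for _ in range(k):
        rng.shuffle(residues)
        decoys.append("".join(residues))
    return decoys


def permutation_p_value(target_score, decoy_scores, higher_is_better=True):
    """Fraction of decoy scores that are better than or equal to the target score.

    Set higher_is_better=False for indicators such as an E-value, where
    smaller values indicate a better match.
    """
    if not decoy_scores:
        raise ValueError("need at least one decoy score")
    if higher_is_better:
        hits = sum(1 for s in decoy_scores if s >= target_score)
    else:
        hits = sum(1 for s in decoy_scores if s <= target_score)
    return hits / len(decoy_scores)


# Illustrative numbers only (not results from the paper).
p_high = permutation_p_value(5, [1, 2, 5, 7])                 # 2 of 4 decoys score >= 5
p_low = permutation_p_value(0.01, [0.5, 0.01, 0.001, 2.0],
                            higher_is_better=False)           # 2 of 4 decoys score <= 0.01
```

Under this definition the smallest nonzero p-value that k decoys can resolve is 1/k, so with a decoy database of size 1×10^5 a result of p < 10^-5 corresponds to no decoy matching as well as the target.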