Xu Hua, Freitas Michael A
Department of Chemistry, The Ohio State University, Columbus, OH 43210, USA.
BMC Bioinformatics. 2007 Apr 20;8:133. doi: 10.1186/1471-2105-8-133.
Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) has become one of the most widely used tools in mass spectrometry-based proteomics. Various algorithms have since been developed to automate the process for modern high-throughput LC-MS/MS experiments.
A probability-based statistical scoring model was derived for assessing peptide and protein matches in tandem MS database searches. The peptide scores in the model represent the probability that a peptide match is a random occurrence, based on either the number or the total abundance of matched product ions in the experimental spectrum. The model also calculates probability-based scores for protein matches. These protein scores reflect the significance of protein matches and can be used to differentiate true from random protein matches.
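The idea of scoring a peptide match by the probability that it arose at random can be illustrated with a minimal sketch. It is not the model derived in the paper: it assumes, purely for illustration, that each theoretical product ion matches a peak by chance independently with a fixed probability, so the number of random matches follows a binomial distribution. The names `random_match_pvalue`, `match_score`, and the parameter `p_random` are hypothetical.

```python
import math
import sys


def random_match_pvalue(n_theoretical: int, n_matched: int, p_random: float) -> float:
    """Probability of at least n_matched random matches among n_theoretical ions.

    Assumption (not from the paper): each theoretical product ion matches a
    peak by chance independently with probability p_random, so the count of
    random matches is Binomial(n_theoretical, p_random).
    """
    return sum(
        math.comb(n_theoretical, k)
        * p_random ** k
        * (1.0 - p_random) ** (n_theoretical - k)
        for k in range(n_matched, n_theoretical + 1)
    )


def match_score(pvalue: float) -> float:
    """Turn a probability into a score: a higher score means a less likely random match."""
    # Clamp to the smallest positive float so log10 never receives zero.
    return -math.log10(max(pvalue, sys.float_info.min))


# Hypothetical example: 20 theoretical product ions, 5% chance of a random hit per ion.
few = match_score(random_match_pvalue(20, 4, 0.05))
many = match_score(random_match_pvalue(20, 8, 0.05))
print(f"4 matched ions: score {few:.2f}; 8 matched ions: score {many:.2f}")
```

Under these assumptions, matching more product ions makes a random explanation less likely and so raises the score. A score based on total matched abundance, as the abstract also mentions, would need a different null distribution and is not sketched here.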
The model is sensitive to mass accuracy and implicitly takes it into account during scoring. High mass accuracy not only reduces false positives but also improves the scores of true positive matches. The algorithm is incorporated into MassMatrix, an automated database search program.
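Why a tighter mass tolerance can raise the score of the same match can be sketched under simple, stated assumptions. The sketch is not the paper's derivation: it assumes that peaks are spread uniformly over the product-ion mass range and that random matches follow a binomial distribution. All names and numbers (`p_random_ion`, `binom_tail`, `score`, 100 peaks, a 1500 Da range) are hypothetical.

```python
import math


def p_random_ion(tol_da: float, n_peaks: int, mass_range_da: float) -> float:
    """Chance that one theoretical ion lies within +/- tol_da of some peak.

    Assumption (not from the paper): n_peaks peaks are placed uniformly and
    independently over a mass range of mass_range_da.
    """
    window = min(1.0, 2.0 * tol_da / mass_range_da)
    return 1.0 - (1.0 - window) ** n_peaks


def binom_tail(n: int, x: int, p: float) -> float:
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(x, n + 1))


def score(n: int, x: int, p: float) -> float:
    """Negative log10 of the random-match probability, clamped to avoid log10(0)."""
    return -math.log10(max(binom_tail(n, x, p), 1e-300))


# Same match (8 of 20 theoretical ions found) scored at two mass tolerances.
low_accuracy = score(20, 8, p_random_ion(0.5, 100, 1500.0))
high_accuracy = score(20, 8, p_random_ion(0.01, 100, 1500.0))
print(f"0.5 Da tolerance: {low_accuracy:.1f}; 0.01 Da tolerance: {high_accuracy:.1f}")
```

With a narrower tolerance, each ion is less likely to match by chance, so the same set of matched ions becomes stronger evidence. In this toy setting, mass accuracy enters the score only through the per-ion random-match probability, which is one way to read "implicitly" in the abstract.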