Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch, Galveston, Texas 77555, United States.
Anal Chem. 2012 Mar 20;84(6):3026-32. doi: 10.1021/ac203255e. Epub 2012 Mar 7.
Improvements in the mass accuracy and resolution of mass spectrometers have greatly aided mass spectrometry-based proteomics in profiling complex biological mixtures. With the use of innovative bioinformatics approaches, high mass accuracy and resolution information can be used for filtering chemical noise in mass spectral data. Using our recent algorithmic developments, we have generated the mass distributions of all theoretical tryptic peptides composed of the 20 natural amino acids and with masses limited to 3.5 kDa. Peptide masses are distributed discretely, with well-defined peak clusters separated by empty or sparsely populated trough regions. Accurate models for peak centers and widths can be used to filter peptide signals from chemical noise. We modeled mass defects (the difference between monoisotopic and nominal masses) as well as peak centers and widths in the peptide mass distributions. We found that peak widths encompassing 95% of all peptide sequences are substantially smaller than previously thought. This result has implications for filtering out larger stretches of the mass axis. Mass defects of peptides exhibit an oscillatory behavior that is damped at high mass values. The periodicity of the oscillations is about 14 Da, which is the most common difference between the masses of the 20 natural amino acids. To determine the effects of amino acid modifications on our findings, we examined the mass distributions of peptides composed of the 20 natural amino acids, oxidized Met, and phosphorylated Ser, Thr, and Tyr. We found that extending the amino acid set with modifications increases the 95% peak width. Mass defects decrease, reflecting the fact that the average mass defect of natural amino acids is larger than that of oxidized Met. We propose that a new model for mass defects and peak widths of peptides may improve peptide identifications by filtering chemical noise in mass spectral data.
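To make the mass defect definition above concrete, the following minimal Python sketch computes the monoisotopic mass, nominal mass, and mass defect of an unmodified peptide from its sequence. It is an illustration only and not the authors' algorithm. The function names are hypothetical, and it assumes that the nominal mass is the sum of the integer mass numbers of the most abundant isotopes; the paper may use a different convention.

```python
from collections import Counter

# Monoisotopic masses (Da) and integer mass numbers of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307400,
        "O": 15.99491462, "S": 31.97207117}
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16, "S": 32}

# Elemental compositions of the 20 natural amino acid residues
# (free amino acid minus one water).
RESIDUES = {
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "A": {"C": 3, "H": 5, "N": 1, "O": 1},
    "S": {"C": 3, "H": 5, "N": 1, "O": 2},
    "P": {"C": 5, "H": 7, "N": 1, "O": 1},
    "V": {"C": 5, "H": 9, "N": 1, "O": 1},
    "T": {"C": 4, "H": 7, "N": 1, "O": 2},
    "C": {"C": 3, "H": 5, "N": 1, "O": 1, "S": 1},
    "L": {"C": 6, "H": 11, "N": 1, "O": 1},
    "I": {"C": 6, "H": 11, "N": 1, "O": 1},
    "N": {"C": 4, "H": 6, "N": 2, "O": 2},
    "D": {"C": 4, "H": 5, "N": 1, "O": 3},
    "Q": {"C": 5, "H": 8, "N": 2, "O": 2},
    "K": {"C": 6, "H": 12, "N": 2, "O": 1},
    "E": {"C": 5, "H": 7, "N": 1, "O": 3},
    "M": {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1},
    "H": {"C": 6, "H": 7, "N": 3, "O": 1},
    "F": {"C": 9, "H": 9, "N": 1, "O": 1},
    "R": {"C": 6, "H": 12, "N": 4, "O": 1},
    "Y": {"C": 9, "H": 9, "N": 1, "O": 2},
    "W": {"C": 11, "H": 10, "N": 2, "O": 1},
}
WATER = {"H": 2, "O": 1}  # added once for the free N- and C-termini


def composition(peptide):
    """Return the elemental composition of a neutral, unmodified peptide."""
    total = Counter(WATER)
    for aa in peptide:
        total.update(RESIDUES[aa])  # Counter.update adds the counts
    return total


def monoisotopic_mass(peptide):
    return sum(MONO[el] * n for el, n in composition(peptide).items())


def nominal_mass(peptide):
    return sum(NOMINAL[el] * n for el, n in composition(peptide).items())


def mass_defect(peptide):
    """Mass defect = monoisotopic mass - nominal mass (assumed convention)."""
    return monoisotopic_mass(peptide) - nominal_mass(peptide)


if __name__ == "__main__":
    seq = "PEPTIDE"
    print(seq, round(monoisotopic_mass(seq), 4), nominal_mass(seq),
          round(mass_defect(seq), 4))
```

For the sequence PEPTIDE this gives a monoisotopic mass of about 799.360 Da, a nominal mass of 799, and therefore a mass defect of about 0.360 Da. Working from elemental compositions rather than a table of residue masses keeps both mass definitions tied to the same atom counts, so their difference is unambiguous under the stated convention.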