Department of Chemistry, University of Florida, P.O. Box 117200, Gainesville, Florida 32611-7200, USA.
Anal Chem. 2011 Oct 15;83(20):8019-23. doi: 10.1021/ac201624t. Epub 2011 Sep 20.
We report trends in the theoretically derived number of compositionally distinct peptides (i.e., peptides made up of different amino acid residues) up to a nominal mass of 1000. A total of 21 amino acid residues commonly found in proteomics studies are included in this study: 19 natural, nonisomeric amino acid residues as well as oxidized methionine and acetamidated cysteine. The number of possibilities is found to increase in an exponential fashion with increasing nominal mass, and the data show a periodic oscillation that starts at mass ~200 and continues up to 1000. Note that similar effects are reported in the companion article on fragment ions from electron capture/transfer dissociation (ECD/ETD) (Mao et al. Anal. Chem. 2011, DOI: 10.1021/ac201619t). The spacing of this oscillation is ~15 mass units at lower masses and ~14 mass units at higher nominal masses. This correlates with the most common mass differences between the amino acid building blocks. In other words, some mass differences are more common than others, and this determines the periodicity in the data. From an analytical point of view, nominal masses with a larger number of compositionally distinct peptides include a substantial number of isomers, which cannot be separated based on mass. Consequently, even ultrahigh mass accuracy (i.e., 0.5 ppm) does not lead to a substantially enhanced rate of identification. Conversely, for adjacent nominal masses with a lower number of isomers, moderately accurate mass (i.e., 10 ppm) gives a higher degree of certainty in identification. These effects are limited to the mass range between 200 and 500 Da. At higher masses, the percentage of uniquely identified peptides drops off to close to zero, independent of nominal mass, due to the inherently high number of isomers.
While the exact number of isobars/isomers at each nominal mass depends on the amino acid building blocks that are considered, the periodicity in the data is found to be remarkably robust; for instance, inclusion of phosphorylated residues barely affects the pattern at lower masses (i.e., <500 Da).
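The counting described in the abstract can be illustrated with a short sketch. This is not the authors' procedure, which the abstract does not specify. It is a minimal dynamic-programming count of residue multisets, assuming that the nominal mass refers to the neutral peptide (sum of nominal residue masses plus 18 for water) and that "acetamidated cysteine" corresponds to a nominal residue mass of 160. The residue labels and the function name `count_compositions` are illustrative only.

```python
# Nominal (integer) residue masses for the 21 building blocks.
# Assumption: "M(ox)" is oxidized methionine (131 + 16) and
# "C(cam)" is acetamidated cysteine (103 + 57); labels are illustrative.
RESIDUE_NOMINAL_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L/I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
    "M(ox)": 147, "C(cam)": 160,
}

# Assumption: a neutral peptide carries one water (H + OH termini).
WATER = 18


def count_compositions(max_mass=1000, residues=RESIDUE_NOMINAL_MASS):
    """Return a list where entry m is the number of compositionally
    distinct peptides (residue multisets, order ignored) of nominal mass m."""
    # ways[s] = number of residue multisets whose residue masses sum to s.
    ways = [0] * (max_mass + 1)
    ways[0] = 1  # the empty multiset, used only as the DP seed
    # Looping over residues on the outside counts each multiset once,
    # regardless of sequence order (classic "coin change" counting).
    for mass in residues.values():
        for total in range(mass, max_mass + 1):
            ways[total] += ways[total - mass]

    counts = [0] * (max_mass + 1)
    # Start at WATER + 1 so the empty composition (bare water) is excluded.
    for peptide_mass in range(WATER + 1, max_mass + 1):
        counts[peptide_mass] = ways[peptide_mass - WATER]
    return counts


if __name__ == "__main__":
    counts = count_compositions()
    # Nominal mass 132: the compositions {G, G} and {N}.
    print(counts[132])  # 2
    # Nominal mass 146: the compositions {Q}, {K}, and {G, A}.
    print(counts[146])  # 3
```

Residues that share a nominal mass (Q and K, or F and oxidized methionine) are kept as separate dictionary entries, so they count as different compositions even though they fall on the same nominal mass. Plotting `counts` on a logarithmic axis is one way to inspect the growth and oscillation the abstract describes.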