Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, United States.
Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States.
J Proteome Res. 2022 Jul 1;21(7):1771-1782. doi: 10.1021/acs.jproteome.2c00211. Epub 2022 Jun 13.
Quantitative mass spectrometry measurements of peptides necessarily incorporate sequence-specific biases that reflect the behavior of the peptide during enzymatic digestion and liquid chromatography and in a mass spectrometer. These sequence-specific effects impair quantification accuracy, yielding peptide quantities that are systematically under- or overestimated. We provide empirical evidence for the existence of such biases, and we use a deep neural network, called Pepper, to automatically identify and reduce these biases. The model generalizes to new proteins and new runs within a related set of tandem mass spectrometry experiments, and the learned coefficients themselves reflect expected physicochemical properties of the corresponding peptide sequences. The resulting adjusted abundance measurements are more correlated with mRNA-based gene expression measurements than the unadjusted measurements. Pepper is suitable for data generated on a variety of mass spectrometry instruments and can be used with labeled or label-free approaches and with data-independent or data-dependent acquisition.
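The idea of a per-peptide coefficient can be illustrated with a toy calculation. The sketch below is not Pepper itself, which uses a deep neural network to predict coefficients from peptide sequence. It assumes (an assumption of this example, not a statement from the abstract) that each bias acts as a multiplicative factor on the measured intensity, so that it becomes an additive offset in log space, and it estimates those offsets directly from sibling peptides of one protein measured across several runs. The peptide sequences, the function names `estimate_log_coefficients` and `adjust`, and all numbers are hypothetical.

```python
from statistics import mean


def estimate_log_coefficients(log_intensity):
    """Estimate one additive log-space offset per peptide.

    log_intensity maps peptide sequence -> list of log intensities, one per
    run, for sibling peptides of the same protein. The per-run mean over
    siblings stands in for the (unknown) protein abundance, so the returned
    offsets are centered: they average to zero across the siblings.
    """
    peptides = list(log_intensity)
    n_runs = len(log_intensity[peptides[0]])
    # Protein-level estimate in each run: mean over sibling peptides.
    run_means = [
        mean(log_intensity[p][r] for p in peptides) for r in range(n_runs)
    ]
    # Offset of a peptide: its average deviation from the per-run mean.
    return {
        p: mean(log_intensity[p][r] - run_means[r] for r in range(n_runs))
        for p in peptides
    }


def adjust(log_intensity, log_coef):
    """Subtract each peptide's estimated offset from its measurements."""
    return {
        p: [value - log_coef[p] for value in values]
        for p, values in log_intensity.items()
    }


# Toy data (hypothetical): one protein, three runs, three sibling peptides.
true_log_abundance = [10.0, 11.0, 12.0]
true_bias = {"LVNELTEFAK": 1.0, "AEFVEVTK": -1.0, "YLYEIAR": 0.0}
observed = {
    p: [a + b for a in true_log_abundance] for p, b in true_bias.items()
}

coefficients = estimate_log_coefficients(observed)
adjusted = adjust(observed, coefficients)
```

In this toy setting the sibling peptides disagree before adjustment and agree after it. Because the offsets are centered within each protein, such a direct estimate cannot transfer to unseen proteins; predicting coefficients from sequence, as the abstract describes, is what allows generalization to new proteins and new runs.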