Bouwmeester Robbin, Gabriels Ralf, Hulstaert Niels, Martens Lennart, Degroeve Sven
VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium.
Department of Biomolecular Medicine, Ghent University, Ghent, Belgium.
Nat Methods. 2021 Nov;18(11):1363-1369. doi: 10.1038/s41592-021-01301-5. Epub 2021 Oct 28.
The inclusion of peptide retention time prediction promises to remove peptide identification ambiguity in complex liquid chromatography-mass spectrometry identification workflows. However, due to the way peptides are encoded in current prediction models, accurate retention times cannot be predicted for modified peptides. This is especially problematic for fledgling open searches, which will benefit from accurate retention time prediction for modified peptides to reduce identification ambiguity. We present DeepLC, a deep learning peptide retention time predictor using peptide encoding based on atomic composition that allows the retention time of (previously unseen) modified peptides to be predicted accurately. We show that DeepLC performs similarly to current state-of-the-art approaches for unmodified peptides and, more importantly, accurately predicts retention times for modifications not seen during training. Moreover, we show that DeepLC's ability to predict retention times for any modification enables potentially incorrect identifications to be flagged in an open search of a wide variety of proteome data.
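The idea of encoding peptides by atomic composition, so that an unseen modification is just a shift in atom counts rather than a new symbol, can be illustrated with a minimal sketch. This is not DeepLC's actual feature layout or API. The names ATOMS, RESIDUES, MOD_DELTAS and encode_peptide are hypothetical, and the three modifications listed are only illustrative examples. The residue formulas are the standard amino acid residue compositions (amino acid minus water).

```python
# Illustrative sketch only: per-residue atom-count encoding of a peptide.
# All names here are hypothetical and are not taken from DeepLC.

ATOMS = ("C", "H", "N", "O", "S", "P")

# Standard amino acid residue compositions (free amino acid minus H2O).
RESIDUES = {
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "A": {"C": 3, "H": 5, "N": 1, "O": 1},
    "S": {"C": 3, "H": 5, "N": 1, "O": 2},
    "P": {"C": 5, "H": 7, "N": 1, "O": 1},
    "V": {"C": 5, "H": 9, "N": 1, "O": 1},
    "T": {"C": 4, "H": 7, "N": 1, "O": 2},
    "C": {"C": 3, "H": 5, "N": 1, "O": 1, "S": 1},
    "L": {"C": 6, "H": 11, "N": 1, "O": 1},
    "I": {"C": 6, "H": 11, "N": 1, "O": 1},
    "N": {"C": 4, "H": 6, "N": 2, "O": 2},
    "D": {"C": 4, "H": 5, "N": 1, "O": 3},
    "Q": {"C": 5, "H": 8, "N": 2, "O": 2},
    "K": {"C": 6, "H": 12, "N": 2, "O": 1},
    "E": {"C": 5, "H": 7, "N": 1, "O": 3},
    "M": {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1},
    "H": {"C": 6, "H": 7, "N": 3, "O": 1},
    "F": {"C": 9, "H": 9, "N": 1, "O": 1},
    "R": {"C": 6, "H": 12, "N": 4, "O": 1},
    "Y": {"C": 9, "H": 9, "N": 1, "O": 2},
    "W": {"C": 11, "H": 10, "N": 2, "O": 1},
}

# A modification is described only by the atoms it adds or removes,
# so a new modification needs no new vocabulary entry in the encoding.
MOD_DELTAS = {
    "Oxidation": {"O": 1},
    "Phospho": {"H": 1, "O": 3, "P": 1},
    "Carbamidomethyl": {"C": 2, "H": 3, "N": 1, "O": 1},
}


def encode_peptide(sequence, modifications=None):
    """Return one row of atom counts (ordered as ATOMS) per residue.

    `modifications` maps 0-based residue positions to names in MOD_DELTAS.
    """
    modifications = modifications or {}
    rows = []
    for position, residue in enumerate(sequence):
        counts = dict(RESIDUES[residue])
        mod_name = modifications.get(position)
        if mod_name is not None:
            # Apply the modification as a shift in atom counts.
            for atom, delta in MOD_DELTAS[mod_name].items():
                counts[atom] = counts.get(atom, 0) + delta
        rows.append([counts.get(atom, 0) for atom in ATOMS])
    return rows


if __name__ == "__main__":
    # Peptide AMK with an oxidized methionine at position 1.
    for row in encode_peptide("AMK", {1: "Oxidation"}):
        print(row)
```

In this sketch, an oxidized methionine differs from an unmodified one only by one extra oxygen in its row, which is why a model trained on such features could, in principle, generalize to modifications it has not seen.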