He Jonathan, Liu Olivia, Guo Xuan
Department of Computer Science and Engineering, University of North Texas, Denton, USA.
Proceedings (IEEE Int Conf Bioinformatics Biomed). 2022 Dec;2022:2342-2348. doi: 10.1109/bibm55620.2022.9995258. Epub 2023 Jan 2.
Accurate peptide identification in LC-MS analysis is crucial for obtaining the protein information that supports biomarker discovery and the profiling of complex proteomes. Detecting peptide fragment ions in tandem mass spectrometry remains challenging because current tools were not designed or tested for the low-abundance, low-peak peptide fragments found in MS2 data. Feature detection, a crucial pre-processing step in the LC-MS analysis pipeline that quantifies peptides by their mass-to-charge ratio, retention time, and intensity, is particularly difficult: peptide signals overlap, and weak signals are often indistinguishable from noise, which has led to a reliance on rigid mathematical structures and heuristics. In this study, we developed a deep-learning-based model with an innovative sliding window process that enables high-resolution processing of quantitative MS/MS data for MS2 feature detection. Experimental results show that our model produces more accurate values and identifications than existing feature detection tools and quantifies a high rate of true positive features. We therefore believe that our model illustrates the advantages of applying deep learning techniques to computational proteomics.
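The abstract does not specify how the sliding window process works, so the following is only a minimal sketch of the general idea. It assumes MS2 intensities are binned into a 2D grid (rows as retention-time scans, columns as m/z bins) and cut into overlapping patches that a model could then score. The function name `sliding_windows`, the window size, the stride, and the toy values are all hypothetical and not taken from the paper.

```python
from typing import Iterator, List, Tuple

# Assumed layout: rows are retention-time scans, columns are m/z bins.
Grid = List[List[float]]


def sliding_windows(
    grid: Grid, height: int, width: int, stride: int
) -> Iterator[Tuple[int, int, Grid]]:
    """Yield (row, col, window) for every window fully inside the grid."""
    n_rows = len(grid)
    n_cols = len(grid[0]) if grid else 0
    for r in range(0, n_rows - height + 1, stride):
        for c in range(0, n_cols - width + 1, stride):
            # Copy the height x width patch whose top-left corner is (r, c).
            window = [row[c:c + width] for row in grid[r:r + height]]
            yield r, c, window


# Toy 4 x 6 intensity grid (hypothetical values, not real MS2 data).
toy_grid: Grid = [
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 5.0, 2.0, 0.0, 0.0],
    [0.0, 1.0, 3.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]

windows = list(sliding_windows(toy_grid, height=2, width=3, stride=1))
```

With a stride of 1, neighboring windows overlap, so a weak peak near the edge of one patch also appears closer to the center of another. This is one plausible reason to prefer overlapping windows for low-intensity signals, though the paper's actual design may differ.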