Cosenza-Contreras Miguel, Seredynska Adrianna, Vogele Daniel, Pinter Niko, Brombacher Eva, Cueto Ruth Fiestas, Dinh Thien-Ly Julia, Bernhard Patrick, Rogg Manuel, Liu Junwei, Willems Patrick, Stael Simon, Huesgen Pitter F, Kuehn E Wolfgang, Kreutz Clemens, Schell Christoph, Schilling Oliver
Faculty of Biology, University of Freiburg, Freiburg, Germany.
Faculty of Medicine, Institute for Surgical Pathology, Medical Center-University of Freiburg, Freiburg, Germany.
Proteomics. 2024 Oct;24(19):e2300491. doi: 10.1002/pmic.202300491. Epub 2024 Aug 10.
State-of-the-art mass spectrometers, combined with modern bioinformatics algorithms for peptide-to-spectrum matching (PSM) with robust statistical scoring, allow more variable features (i.e., post-translational modifications) to be reliably identified from (tandem) mass spectrometry data, often without the need for biochemical enrichment. Semi-specific proteome searches, which enforce the theoretical enzymatic cleavage specificity at only the N- or C-terminal end of a peptide, allow the identification of native protein termini or those arising from endogenous proteolytic activity (also referred to as "neo-N-termini" analysis or "N-terminomics"). Nevertheless, deriving biological meaning from these search outputs can be challenging in terms of data mining and analysis. Thus, we introduce TermineR, a data analysis approach for (1) the annotation of peptides according to their enzymatic cleavage specificity and known protein processing features, (2) differential abundance and enrichment analysis of N-terminal sequence patterns, and (3) visualization of neo-N-termini locations. We illustrate the use of TermineR by applying it to tandem mass tag (TMT)-based proteomics data from a mouse model of polycystic kidney disease, and assess the utility of semi-specific searches for the biological interpretation of cleavage events and of the variable contribution of proteolytic products to overall protein abundance. The TermineR approach and example data are available as an R package at https://github.com/MiguelCos/TermineR.
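The idea of annotating peptides by enzymatic cleavage specificity can be illustrated with a minimal Python sketch. It is not TermineR's implementation (TermineR is an R package); the function name `classify_peptide`, the category labels, and the simplified trypsin rule (cleavage after K or R, with no proline exception) are assumptions made here for illustration only.

```python
# Hypothetical sketch: label a peptide by how its ends match a simplified
# trypsin rule. Names and labels are illustrative, not TermineR's API.

TRYPSIN_RESIDUES = frozenset("KR")  # assumption: cleavage C-terminal to K or R


def classify_peptide(protein: str, peptide: str) -> str:
    """Return 'specific', 'semi_Nterm', 'semi_Cterm', or 'unspecific'.

    'semi_Nterm' means the peptide's N-terminus does not match the enzyme
    rule (a candidate neo-N-terminus); 'semi_Cterm' means the C-terminus
    does not match. Only the first occurrence of the peptide is considered.
    """
    start = protein.find(peptide)
    if start == -1 or not peptide:
        raise ValueError("peptide not found in protein sequence")
    end = start + len(peptide)

    # N-terminal end is explained by the enzyme if the preceding residue is
    # K/R, or if the peptide starts at the protein's own N-terminus.
    n_specific = start == 0 or protein[start - 1] in TRYPSIN_RESIDUES
    # C-terminal end is explained if the peptide ends in K/R, or if it ends
    # at the protein's own C-terminus.
    c_specific = end == len(protein) or peptide[-1] in TRYPSIN_RESIDUES

    if n_specific and c_specific:
        return "specific"
    if c_specific:
        return "semi_Nterm"
    if n_specific:
        return "semi_Cterm"
    return "unspecific"


if __name__ == "__main__":
    toy_protein = "MKTAYIAKQRQISFVK"  # made-up sequence for demonstration
    for pep in ("TAYIAK", "YIAK", "TAYI", "AYI"):
        print(pep, classify_peptide(toy_protein, pep))
```

In this toy example, a peptide labeled `semi_Nterm` is the kind of feature that a semi-specific search can report and that a downstream analysis would then examine as a possible native or proteolytically generated N-terminus.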