Department of Microbial Proteomics, Institute of Microbiology, University of Greifswald, D-17489 Greifswald, Germany.
Agroscope, Research Group Molecular Diagnostics, Genomics & Bioinformatics and SIB Swiss Institute of Bioinformatics, CH-8820 Wädenswil, Switzerland.
J Proteome Res. 2020 Oct 2;19(10):4004-4018. doi: 10.1021/acs.jproteome.0c00286. Epub 2020 Sep 2.
Small open reading frame-encoded proteins (SEPs) have gained increasing interest in recent years because of their broad range of important functions in both prokaryotes and eukaryotes. In bacteria, SEPs have been associated with signaling, virulence, and the regulation of enzyme activities. Nonetheless, the number of SEPs detected in large-scale proteome studies is often low because classical methods are biased toward the identification of larger proteins. Here, we present a workflow that enhances the identification of small proteins compared to traditional protocols. To this end, the steps of small protein enrichment, proteolytic digestion, and database search were reviewed and adjusted to the specific requirements of SEPs. Enrichment using small-pore-sized solid-phase material increased the number of identified SEPs by a factor of 2, and the use of proteases other than trypsin reduced the spectral counts for larger proteins. Applying the optimized protocol allowed the detection of 210 already annotated proteins of up to 100 amino acids (aa) in length, including 16 proteins below 51 aa, in a Gram-positive model organism. Moreover, 12% of all identified proteins were up to 100 aa long, which is a significantly larger fraction than that reported in studies involving traditional proteomics workflows. Finally, the application of an integrated proteogenomics search database and extensive subsequent validation resulted in the confident identification of three novel, not yet annotated SEPs, which are 21, 26, and 42 aa long.
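The length classes used above (proteins of up to 100 aa, and the subset below 51 aa) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `read_fasta` and `summarize_small_proteins`, the identifiers, and the toy sequences are hypothetical, and only the two length thresholds are taken from the abstract.

```python
from typing import Dict, List, Union


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {identifier: sequence}."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the identifier.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def summarize_small_proteins(
    records: Dict[str, str],
    max_len: int = 100,          # "up to 100 aa" (threshold from the abstract)
    very_small_below: int = 51,  # "below 51 aa" (threshold from the abstract)
) -> Dict[str, Union[List[str], float]]:
    """Bin identified proteins by length and report the small-protein fraction."""
    # Ignore a trailing stop-codon symbol if the database includes one.
    lengths = {name: len(seq.rstrip("*")) for name, seq in records.items()}
    small = sorted(n for n, length in lengths.items() if length <= max_len)
    very_small = sorted(n for n in small if lengths[n] < very_small_below)
    fraction = len(small) / len(lengths) if lengths else 0.0
    return {"small": small, "very_small": very_small, "fraction_small": fraction}


# Toy input with made-up sequences of 21, 80, and 150 residues.
toy_fasta = "\n".join([
    ">sepA hypothetical",
    "M" + "A" * 20,
    ">protB hypothetical",
    "M" + "K" * 79,
    ">protC hypothetical",
    "M" + "L" * 149,
])

summary = summarize_small_proteins(read_fasta(toy_fasta))
print(summary)
```

In a real analysis, the input would be the set of proteins that passed identification and validation, not the full search database, so that the reported fraction reflects detected proteins only.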