Proteomics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.
Molecular Ecology, Agroscope & SIB Swiss Institute of Bioinformatics, 8046 Zürich, Switzerland.
Anal Chem. 2023 Aug 15;95(32):11892-11900. doi: 10.1021/acs.analchem.3c00676. Epub 2023 Aug 3.
Small proteins of around 50 aa in length have been largely overlooked in genetic and biochemical assays due to the inherent challenges of detecting and characterizing them. Recent discoveries of their critical roles in many biological processes have led to an increased recognition of the importance of small proteins for basic research and as potential new drug targets. One example is CcoM, a 36 aa subunit of the -type oxidase that plays an essential role in adaptation to oxygen-limited conditions in , a model for the clinically relevant, opportunistic pathogen . However, as no comprehensive data were available in , we devised an integrated, generic approach to study small proteins more systematically. Using the first complete genome as a basis, we conducted bottom-up proteomics analyses and established a digest-free, direct-sequencing proteomics approach to study cells grown under aerobic and oxygen-limiting conditions. Finally, we also applied a proteogenomics pipeline to identify missed protein-coding genes. Overall, we identified 2921 known and 29 novel proteins, many of which were differentially regulated. Among the 176 small proteins identified, 16 were novel. Direct sequencing, featuring a specialized precursor acquisition scheme, exhibited advantages in the detection of small proteins, with higher (up to 100%) sequence coverage and more spectral counts, including for sequences with high proline content. Three novel small proteins, uniquely identified by direct sequencing and not conserved beyond , were predicted to form an operon with a conserved protein and may represent genes. These data demonstrate the power of this combined approach to study small proteins in and show its potential for other prokaryotes.
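The abstract reports sequence coverage of up to 100% for small proteins. As a minimal sketch of how that metric is commonly computed (the percentage of residues in a protein that lie within at least one identified peptide), the Python below may help. The function name `sequence_coverage` and the example sequence are hypothetical and are not taken from the study; matching peptides by exact substring search is a simplifying assumption that ignores modifications and isobaric residues.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Return the percentage of residues covered by at least one peptide.

    Assumption: each peptide is matched by exact substring search, and every
    occurrence in the protein counts as covered.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue  # skip empty strings
        start = protein.find(pep)
        while start != -1:
            # mark every residue spanned by this occurrence
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical 10-residue sequence, for illustration only.
example_protein = "MKTAYIAKQR"
partial = sequence_coverage(example_protein, ["MKTAY", "AKQR"])  # 9 of 10 residues
full = sequence_coverage(example_protein, [example_protein])      # intact sequence
print(partial, full)
```

In a digest-free workflow the intact sequence itself can be observed, which is why full coverage is reachable for a short protein, whereas a protease digest may leave residues unobserved, as with the isoleucine at position 6 in the made-up example.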