Jiang Songhao, Shi Jiahui, Li Yanchang, Zhang Zhenpeng, Chang Lei, Wang Guibin, Wu Wenhui, Yu Liyan, Dai Erhei, Zhang Lixia, Lyu Zhitang, Xu Ping, Zhang Yao
Key Laboratory of Microbial Diversity Research and Application of Hebei, School of Life Sciences, Hebei University, Baoding, China.
Beijing Proteome Research Center, National Center for Protein Sciences Beijing, State Key Laboratory of Proteomics, Research Unit of Proteomics and Research and Development of New Drug of Chinese Academy of Medical Sciences, Institute of Lifeomics, Beijing, China.
Front Microbiol. 2022 Oct 12;13:1015140. doi: 10.3389/fmicb.2022.1015140. eCollection 2022.
Accurate identification of novel peptides remains challenging in large-scale proteogenomic studies because evaluation criteria are lacking. The mirror proteases trypsin and LysargiNase generate complementary b/y ion series, which makes it possible to assess the authenticity of novel peptides experimentally rather than only filtering candidate targets by ranking at different false discovery rates (FDRs). In this study, a pair of in-house developed acetylated mirror proteases, Ac-Trypsin and Ac-LysargiNase, was applied to strain MC 155 for proteogenomic analysis. The mirror proteases accurately identified 368 novel peptides, which showed 75-80% b and y ion coverage, compared with 65-68% y or b ion coverage for Ac-Trypsin alone (38.9% b and 68.3% y) or Ac-LysargiNase alone (65.5% b and 39.6% y), as was the case for annotated peptides from MC 155. The complementary b and y ion series greatly increased the reliability of the overlapping sequences derived from novel peptides. Of these novel peptides, 311 were already annotated in other public strains, and 57 with more continuous b and y pairs were retained for further analysis after spectral quality assessment. With these peptides, the mirror proteases corrected the N-termini of six annotated proteins and detected 17 new coding open reading frames (ORFs). We believe that mirror proteases will be an effective strategy for novel peptide detection in both prokaryotic and eukaryotic proteogenomics.
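The idea of complementary b and y ion coverage from a mirror protease pair can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the peptide pair (AEDGLK and KAEDGL, sharing the core AEDGL), and the observed ion indices are all hypothetical, and coverage is assumed here to mean the fraction of backbone cleavage sites supported by at least one fragment ion.

```python
def series_coverage(n_residues, observed):
    """Fraction of the n-1 possible ions of one series (b or y) that were observed."""
    n_sites = n_residues - 1
    hits = {i for i in observed if 1 <= i <= n_sites}
    return len(hits) / n_sites


def covered_sites(n_residues, b_ions, y_ions):
    """Backbone sites 1..n-1 supported by a b ion (b_i -> site i) or a y ion (y_j -> site n-j)."""
    sites = {i for i in b_ions if 1 <= i < n_residues}
    sites |= {n_residues - j for j in y_ions if 1 <= j < n_residues}
    return sites


def shift_to_core(sites, n_terminal_extra, core_len):
    """Re-index sites onto the shared core sequence and drop sites outside it."""
    shifted = {s - n_terminal_extra for s in sites}
    return {s for s in shifted if 1 <= s < core_len}


# Hypothetical mirror pair sharing the 5-residue core AEDGL (4 internal sites).
CORE_LEN = 5

# Tryptic-like peptide AEDGLK: basic residue at the C-terminus, y ions dominate.
tryptic_y = series_coverage(6, {1, 2, 3})
tryptic_core = shift_to_core(
    covered_sites(6, b_ions=set(), y_ions={1, 2, 3}), 0, CORE_LEN
)

# LysargiNase-like peptide KAEDGL: basic residue at the N-terminus, b ions dominate.
mirror_b = series_coverage(6, {2, 3})
mirror_core = shift_to_core(
    covered_sites(6, b_ions={2, 3}, y_ions=set()), 1, CORE_LEN
)

# Union of the sites each digest supports on the shared core sequence.
combined_core = tryptic_core | mirror_core
combined_coverage = len(combined_core) / (CORE_LEN - 1)
```

In this toy case each digest supports only half of the core's cleavage sites, while the union of the two supports all of them, which is the sense in which the two ion series are complementary.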