IZKF Core Unit Proteomics, Interdisciplinary Center for Clinical Research, University of Münster, Röntgenstr. 21, 48149 Münster, Germany.
Institute of Physiological Chemistry and Pathobiochemistry, University of Münster, Waldeyer-Str. 15, 48149 Münster, Germany.
Molecules. 2022 Aug 5;27(15):4976. doi: 10.3390/molecules27154976.
(1) Background: De novo sequencing, the elucidation of peptide amino acid sequences from gas-phase fragmentation mass spectra, is a valuable method for identifying unknown proteins and is complementary to Edman sequencing. It is increasingly used in shotgun mass spectrometry (MS)-based proteomics experiments. We review the current state of the art and use the identification of an unknown snake venom protein targeting human tissue factor (TF) as an example to describe the analysis process based on manual spectrum interrogation.
(2) Methods: Immobilized TF was incubated with a crude venom solution. Potential binding partners were eluted and further purified by gel electrophoresis. Edman degradation was performed to elucidate the N-terminus of the 31 kDa protein of interest. High-resolution MS with collision-induced dissociation was employed to generate peptide fragmentation spectra. Sequence tags were deduced and used for searches in the NCBI and UniProt databases. Protein matches from snake species were further validated by targeted MS/MS.
(3) Results: The sequence tag D [K/Q] D [I/L] VDD [K/Q] led to a snake venom serine protease (SVSP) from lancehead (P81824). With targeted MS/MS, 24% of the SVSP sequence was confirmed; an additional 41% was tentatively assigned by data-independent MS. Edman sequencing provided information for 10 N-terminal amino acid residues, also confirming the match to SVSP.
(4) Conclusions: The identification of unknown proteins continues to be a challenge despite major advances in MS instrumentation and bioinformatic tools. The main requirement is the generation of meaningful, high-quality MS peptide fragmentation spectra. These are used to elucidate sufficiently long sequence tags, which can subsequently be submitted to searches in protein databases. This basic method does not require extensive bioinformatics because peptide MS/MS spectra, especially those of doubly charged ions, can be analysed manually. We demonstrated the procedure with the elucidation of SVSP. While de novo sequencing quickly indicates the correct protein group, validating the entire protein sequence amino acid by amino acid takes time, because isobaric amino acid residues and modifications must be assigned properly. With the ongoing efforts in genomics and transcriptomics and the availability of ever more data in public databases, the need for de novo MS sequencing will decrease. Still, not every animal and plant species will be sequenced, so the combination of MS and Edman sequencing will continue to be of importance for the identification of unknown proteins.
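The bracketed positions in a tag such as D [K/Q] D [I/L] VDD [K/Q] mark residues that fragment masses alone may not distinguish. As a minimal sketch of how such a tag could be matched against a candidate protein sequence, the Python below converts the tag into a regular expression and reports match positions. The function names and the example sequence are hypothetical illustrations, not part of the published workflow and not the actual P81824 sequence; the residue masses are standard monoisotopic values.

```python
import re

# Monoisotopic residue masses in Da (standard values).
# I and L are exactly isobaric; K and Q differ by about 0.036 Da.
RESIDUE_MASS = {
    "I": 113.08406,
    "L": 113.08406,
    "K": 128.09496,
    "Q": 128.05858,
}


def tag_to_regex(tag: str) -> re.Pattern:
    """Convert a tag like 'D [K/Q] D [I/L] VDD [K/Q]' into a compiled regex.

    Hypothetical helper: '[K/Q]' becomes the character class '[KQ]'.
    The pattern is wrapped in a lookahead so overlapping hits are reported.
    """
    compact = tag.replace(" ", "")
    body = re.sub(
        r"\[([A-Z](?:/[A-Z])*)\]",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        compact,
    )
    return re.compile("(?=" + body + ")")


def find_tag(tag: str, sequence: str) -> list[int]:
    """Return the 0-based start positions where the tag matches the sequence."""
    return [m.start() for m in tag_to_regex(tag).finditer(sequence)]


# Made-up sequence for illustration only (assumption, not a real protein).
example_sequence = "MVLIRDQDLVDDKAAG"
print(find_tag("D [K/Q] D [I/L] VDD [K/Q]", example_sequence))  # [5]
```

The mass table shows why the ambiguity arises: I and L have identical residue masses, whereas K and Q differ by roughly 0.036 Da, so the K/Q pair can in principle be separated when the mass accuracy is high enough, while I/L needs additional evidence such as Edman sequencing or a database match.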