Robotham Scott A, Horton Andrew P, Cannon Joe R, Cotham Victoria C, Marcotte Edward M, Brodbelt Jennifer S
Department of Chemistry, University of Texas, Austin, Texas 78712, United States.
Center for Systems and Synthetic Biology, Department of Molecular Biosciences, University of Texas, Austin, Texas 78712, United States.
Anal Chem. 2016 Apr 5;88(7):3990-7. doi: 10.1021/acs.analchem.6b00261. Epub 2016 Mar 14.
De novo peptide sequencing by mass spectrometry represents an important strategy for characterizing novel peptides and proteins, in which a peptide's amino acid sequence is inferred directly from the precursor peptide mass and tandem mass spectrum (MS/MS or MS(3)) fragment ions, without comparison to a reference proteome. This method is ideal for organisms or samples lacking a complete or well-annotated reference sequence set. One of the major barriers to de novo spectral interpretation arises from confusion of N- and C-terminal ion series due to the symmetry between b and y ion pairs created by collisional activation methods (or c, z ions for electron-based activation methods). This is known as the "antisymmetric path problem" and leads to inverted amino acid subsequences within a de novo reconstruction. Here, we combine several key strategies for de novo peptide sequencing into a single high-throughput pipeline: high-efficiency carbamylation blocks lysine side chains, and subsequent tryptic digestion and N-terminal peptide derivatization with the ultraviolet chromophore AMCA yield peptides susceptible to 351 nm ultraviolet photodissociation (UVPD). UVPD-MS/MS of the AMCA-modified peptides then predominantly produces y ions in the MS/MS spectra, specifically addressing the antisymmetric path problem. Finally, the program UVnovo applies a random forest algorithm to automatically learn from and then interpret UVPD mass spectra, passing results to a hidden Markov model for de novo sequence prediction and scoring. We show this combined strategy provides high-performance de novo peptide sequencing, enabling the de novo sequencing of thousands of peptides from an Escherichia coli lysate at high confidence.
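The b/y symmetry behind the antisymmetric path problem can be illustrated with a short calculation. The following is a minimal sketch, not part of UVnovo: it assumes an unmodified, singly protonated peptide (no carbamylation or AMCA tag) and uses standard monoisotopic residue masses. The function names and the example peptide are hypothetical. It shows that each b ion and its complementary y ion sum to the neutral peptide mass plus two protons, so an isolated fragment peak is ambiguous between the two series unless one series is suppressed.

```python
# Monoisotopic masses in daltons (standard values).
PROTON = 1.007276
WATER = 18.010565
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def neutral_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def b_ions(peptide):
    """Singly charged b ions b_1 .. b_(n-1), built from the N-terminus."""
    masses, total = [], 0.0
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        masses.append(total + PROTON)
    return masses


def y_ions(peptide):
    """Singly charged y ions y_1 .. y_(n-1), built from the C-terminus."""
    masses, total = [], 0.0
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        masses.append(total + WATER + PROTON)
    return masses


def complementary_sums(peptide):
    """Sum of each b_i with its complement y_(n-i)."""
    b, y = b_ions(peptide), y_ions(peptide)
    n = len(peptide)
    # b[i] holds b_(i+1); its complement y_(n-i-1) sits at y[n-2-i].
    return [b[i] + y[n - 2 - i] for i in range(n - 1)]


if __name__ == "__main__":
    peptide = "PEPTIDEK"  # hypothetical example sequence
    target = neutral_mass(peptide) + 2 * PROTON
    for i, total in enumerate(complementary_sums(peptide), start=1):
        print(f"b{i} + y{len(peptide) - i} = {total:.4f} (target {target:.4f})")
```

Because every pair sums to the same value, a fragment at mass m is consistent with either a b ion at m or a y ion whose complement lies at that constant minus m. A spectrum that contains predominantly y ions, as described for UVPD of the AMCA-modified peptides, removes that ambiguity.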