Zhu Zhikai, Su Xiaomeng, Go Eden P, Desaire Heather
Department of Chemistry, University of Kansas, Lawrence, Kansas 66047, United States.
Anal Chem. 2014 Sep 16;86(18):9212-9. doi: 10.1021/ac502176n. Epub 2014 Aug 28.
Glycoproteins are biologically significant macromolecules that participate in numerous cellular activities. To obtain site-specific protein glycosylation information, intact glycopeptides, in which the glycan remains attached to the peptide sequence, are characterized by tandem mass spectrometry (MS/MS) methods such as collision-induced dissociation (CID) and electron transfer dissociation (ETD). Although several automated tools for this analysis are emerging, the field has no consensus on the best way to determine the reliability of these tools and/or to provide a false discovery rate (FDR). A common approach to calculating FDRs for glycopeptide analysis, adapted from the target-decoy strategy in proteomics, employs a decoy database created from the target protein sequence database. This approach is not optimal for measuring the confidence of N-linked glycopeptide matches, however, because glycopeptide data sets are considerably smaller than peptide data sets, and the requirement of a consensus sequence for N-glycosylation further limits the number of possible decoy glycopeptides tested in a database search. To address the need to determine FDRs accurately for automated glycopeptide assignments, we developed GlycoPep Evaluator (GPE), a tool that helps measure FDRs for glycopeptide identifications without using a decoy database. GPE generates decoy glycopeptides de novo for every target glycopeptide, in a 1:20 target-to-decoy ratio. The decoys and the target glycopeptides are scored against the ETD data, and FDRs for small data sets can then be calculated accurately from the number of decoy matches and the target-to-decoy ratio. GPE is freely available for download and can work with any search engine that interprets ETD data of N-linked glycopeptides. The software is provided at https://desairegroup.ku.edu/research.
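The FDR calculation described above can be illustrated with a minimal Python sketch. The abstract does not give GPE's exact formula or scoring function, so everything here is an assumption made for illustration: the function names `count_hits` and `estimate_fdr` are hypothetical, a spectrum is counted as a decoy match when any of its decoys scores at least as high as the target, and the FDR is estimated by scaling the decoy match count by the 1:20 target-to-decoy ratio, as is common for scaled target-decoy estimates.

```python
from typing import Iterable, Sequence, Tuple


def count_hits(
    scored: Iterable[Tuple[float, Sequence[float]]],
) -> Tuple[int, int]:
    """Count target and decoy matches.

    Each item is (target_score, decoy_scores) for one spectrum.
    Assumption: a spectrum is a target match only if the target
    strictly outscores every decoy; otherwise it is a decoy match.
    """
    target_hits = 0
    decoy_hits = 0
    for target_score, decoy_scores in scored:
        best_decoy = max(decoy_scores, default=float("-inf"))
        if target_score > best_decoy:
            target_hits += 1
        else:
            decoy_hits += 1
    return target_hits, decoy_hits


def estimate_fdr(
    target_hits: int, decoy_hits: int, decoys_per_target: int = 20
) -> float:
    """Scaled target-decoy FDR estimate (assumed form, not GPE's own).

    With 20 decoys per target, a random match lands on a decoy about
    20 times as often as on the target, so the expected number of
    false target matches is roughly decoy_hits / decoys_per_target.
    """
    if decoys_per_target <= 0:
        raise ValueError("decoys_per_target must be positive")
    if target_hits < 0 or decoy_hits < 0:
        raise ValueError("hit counts must be non-negative")
    if target_hits == 0:
        return 0.0
    expected_false_targets = decoy_hits / decoys_per_target
    return min(1.0, expected_false_targets / target_hits)
```

Under these assumptions, `estimate_fdr(50, 20)` treats 20 decoy matches as evidence of about one false match among 50 accepted target matches. Drawing many decoys per target is what keeps the estimate usable when only a few dozen glycopeptide spectra are available.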