Improving sensitivity in shotgun proteomics using a peptide-centric database with reduced complexity: protease cleavage and SCX elution rules from data mining of MS/MS spectra.

Author information

Yen Chia-Yu, Russell Steve, Mendoza Alex M, Meyer-Arendt Karen, Sun Shaojun, Cios Krzysztof J, Ahn Natalie G, Resing Katheryn A

Affiliation

Department of Computer Science and Engineering, University of Colorado at Denver and Health Sciences Center, Denver, CO 80217-3364, USA.

Publication information

Anal Chem. 2006 Feb 15;78(4):1071-84. doi: 10.1021/ac051127f.

Abstract

Correct identification of a peptide sequence from MS/MS data is still a challenging research problem, particularly in proteomic analyses of higher eukaryotes where protein databases are large. The scoring methods of search programs often generate cases where incorrect peptide sequences score higher than correct peptide sequences (referred to as distraction). Because smaller databases yield less distraction and better discrimination between correct and incorrect assignments, we developed a method for editing a peptide-centric database (PC-DB) to remove unlikely sequences and strategies for enabling search programs to utilize this peptide database. Rules for unlikely missed cleavage and nontryptic proteolysis products were identified by data mining 11 849 high-confidence peptide assignments. We also evaluated ion exchange chromatographic behavior as an editing criterion to generate subset databases. When used to search a well-annotated test data set of MS/MS spectra, we found no loss of critical information using PC-DBs, validating the methods for generating and searching against the databases. On the other hand, improved confidence in peptide assignments was achieved for tryptic peptides, measured by changes in DeltaCN and RSP. Decreased distraction was also achieved, consistent with the 3-9-fold decrease in database size. Data mining identified a major class of common nonspecific proteolytic products corresponding to leucine aminopeptidase (LAP) cleavages. Large improvements in identifying LAP products were achieved using the PC-DB approach when compared with conventional searches against protein databases. These results demonstrate that peptide properties can be used to reduce database size, yielding improved accuracy and information capture due to reduced distraction, but with little loss of information compared to conventional protein database searches.
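The abstract does not list the mined rules, so the following Python sketch only illustrates the general idea of a peptide-centric database under stated assumptions; it is not the authors' implementation. It assumes the commonly used tryptic convention (cleave after K or R unless the next residue is P), models LAP-like products as simple N-terminal truncations, and uses a hypothetical `keep` predicate as a stand-in for whatever editing rules (missed-cleavage, nontryptic, or SCX elution criteria) would remove unlikely sequences. All function names and the toy protein sequences are illustrative.

```python
from collections import defaultdict


def cleavage_fragments(protein):
    """Split a protein at assumed tryptic sites: after K or R, unless followed by P."""
    fragments, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder
    return fragments


def digest(protein, max_missed=1, min_len=1):
    """Return the set of peptides with at most `max_missed` missed cleavages."""
    frags = cleavage_fragments(protein)
    peptides = set()
    for i in range(len(frags)):
        # Join fragment i with up to `max_missed` following fragments.
        for j in range(i, min(i + max_missed + 1, len(frags))):
            pep = "".join(frags[i:j + 1])
            if len(pep) >= min_len:
                peptides.add(pep)
    return peptides


def n_terminal_ladder(peptide, max_trim=2, min_len=1):
    """Peptides left after removing 1..max_trim residues from the N-terminus
    (an assumed, simplified model of aminopeptidase products)."""
    return [peptide[k:] for k in range(1, max_trim + 1)
            if len(peptide) - k >= min_len]


def build_peptide_db(proteins, max_missed=1, min_len=1, keep=lambda pep: True):
    """Map each unique peptide to the set of protein IDs that contain it.

    `keep` is a hypothetical filter; returning False drops a peptide,
    which is how database-editing rules would shrink the search space.
    """
    db = defaultdict(set)
    for protein_id, sequence in proteins.items():
        for pep in digest(sequence, max_missed, min_len):
            if keep(pep):
                db[pep].add(protein_id)
    return dict(db)


if __name__ == "__main__":
    toy_proteins = {"P1": "MAKRPLLKGGR", "P2": "TTKGGR"}  # made-up sequences
    full = build_peptide_db(toy_proteins, max_missed=1)
    # Example rule (an assumption): discard every peptide with an internal K or R.
    edited = build_peptide_db(
        toy_proteins, max_missed=1,
        keep=lambda pep: not any(aa in "KR" for aa in pep[:-1]),
    )
    print(len(full), len(edited))
    print(n_terminal_ladder("RPLLK"))
```

Because each unique peptide is stored once and unlikely peptides are dropped before the search, the number of candidate sequences a search program must score decreases, which is the mechanism the abstract credits for reduced distraction.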

