PepQuery enables fast, accurate, and convenient proteomic validation of novel genomic alterations.

Affiliations

Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030, USA.

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA.

Publication information

Genome Res. 2019 Mar;29(3):485-493. doi: 10.1101/gr.235028.118. Epub 2019 Jan 4.

Abstract

Massively parallel or second-generation sequencing-based genomic studies continuously identify new genomic alterations that may lead to novel protein sequences, which are attractive candidates for disease biomarkers and therapeutic targets after proteomic validation. Integrative proteogenomic methods have been developed to use mass spectrometry (MS)-based proteomics data for such validation. These methods replace the reference sequence database in proteomic database searching with a customized protein database that incorporates sample- or disease-specific sequences derived from DNA or RNA sequencing, thus enabling the identification of novel protein sequences. Although useful, this spectrum-centric approach requires a full evaluation of all possible spectrum-peptide pairs, which is time-consuming, error-prone, and difficult to apply. Here, we present PepQuery, a peptide-centric approach that focuses on only novel DNA or protein sequences of interest. PepQuery allows quick and easy proteomic validation of genomic alterations without customized database construction. We demonstrated the sensitivity and specificity of the approach in validating completely novel proteins, novel splice junctions, and single amino acid variants using simulations and experimental data. Notably, enabling unrestricted modification searching in PepQuery reduced false positives by up to 95%. We implemented PepQuery as both web-based and stand-alone applications. The web version provides direct access to more than half a billion MS/MS spectra from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) and other cancer proteomic studies. The stand-alone version supports batch analysis and user-provided MS/MS data. PepQuery will increase the usage of proteogenomics beyond the proteomics community and will broaden the application of proteogenomics in personalized medicine.
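The peptide-centric idea in the abstract (start from one peptide of interest and ask which spectra could support it, rather than scoring every spectrum against a whole customized database) can be illustrated with a minimal Python sketch. This is not PepQuery's implementation or API. It shows only a generic first step of such a search, selecting candidate spectra by precursor mass, and it assumes unmodified peptides, a 10 ppm tolerance, and a simple `(id, precursor m/z, charge)` tuple for each spectrum. The names `peptide_mass` and `candidate_spectra` are hypothetical.

```python
# Illustrative sketch only: hypothetical helpers, not PepQuery code.
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of one charge-carrying proton


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def candidate_spectra(peptide, spectra, tol_ppm=10.0):
    """Return ids of spectra whose precursor mass matches the peptide.

    `spectra` is an iterable of (spectrum_id, precursor_mz, charge).
    The tolerance value is an assumption chosen for illustration.
    """
    target = peptide_mass(peptide)
    hits = []
    for spec_id, precursor_mz, charge in spectra:
        # Convert the observed m/z back to a neutral mass.
        neutral = (precursor_mz - PROTON) * charge
        ppm_error = abs(neutral - target) / target * 1e6
        if ppm_error <= tol_ppm:
            hits.append(spec_id)
    return hits


if __name__ == "__main__":
    target = peptide_mass("PEPTIDE")
    spectra = [
        ("s1", (target + 2 * PROTON) / 2, 2),  # doubly charged match
        ("s2", 500.0, 2),                      # unrelated precursor
    ]
    print(candidate_spectra("PEPTIDE", spectra))
```

In a real workflow the candidates would then be scored against the peptide's fragment ions and compared with competing reference and modified peptides; that scoring and statistical evaluation are outside this sketch.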
