Vizcaíno Juan Antonio, Reisinger Florian, Côté Richard, Martens Lennart
European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK.
Methods Mol Biol. 2011;696:93-105. doi: 10.1007/978-1-60761-987-1_6.
The Proteomics Identifications Database (PRIDE, http://www.ebi.ac.uk/pride) lets users explore and compare mass spectrometry-based proteomics experiments that reveal details of protein expression across a broad range of taxonomic groups, tissues, and disease states. A PRIDE experiment typically includes identifications of proteins, peptides, and protein modifications. Many of the submitted experiments also include the mass spectra that provide the evidence for these identifications. One of the strongest advantages of PRIDE over other proteomics repositories is the amount of metadata it contains, which is essential for placing these data in biological and/or technical context. Several informatics tools have been developed in support of the PRIDE database. The most recent is "Database on Demand" (DoD), which allows custom sequence databases to be built in order to optimize the results from search engines. We describe the use of DoD in this chapter. Additionally, to show the potential of PRIDE as a source for data mining, we explore complex federated BioMart queries that integrate PRIDE data with other resources, such as Ensembl, Reactome, or UniProt.
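To illustrate what a custom sequence database means in practice, the sketch below filters FASTA entries down to a single organism before they would be handed to a search engine. It is not the DoD implementation, which the abstract does not describe. It assumes UniProt-style headers that carry an `OS=` organism field, and the function names `parse_fasta`, `organism_of`, and `build_custom_database` are hypothetical.

```python
import re
from typing import Iterable, Iterator, List, Optional, Tuple

# Assumption: headers follow the UniProt convention, e.g.
# ">sp|ACC|NAME Description OS=Homo sapiens OX=9606 GN=..."
_ORGANISM = re.compile(r"OS=(.+?)(?: [A-Z]{2}=|$)")


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def organism_of(header: str) -> Optional[str]:
    """Return the organism named in the OS= field, or None if absent."""
    match = _ORGANISM.search(header)
    return match.group(1) if match else None


def build_custom_database(lines: Iterable[str], organism: str) -> List[Tuple[str, str]]:
    """Keep only the entries whose OS= field equals the requested organism."""
    return [(h, s) for h, s in parse_fasta(lines) if organism_of(h) == organism]


if __name__ == "__main__":
    # Toy records with placeholder accessions and sequences, for illustration only.
    sample = [
        ">sp|X00001|EXA_HUMAN Example protein A OS=Homo sapiens OX=9606 GN=EXA",
        "MKTAYIAK",
        "QRQISFVK",
        ">sp|X00002|EXB_MOUSE Example protein B OS=Mus musculus OX=10090 GN=Exb",
        "MGGSLLPW",
    ]
    for header, sequence in build_custom_database(sample, "Homo sapiens"):
        print(f">{header}\n{sequence}")
```

Restricting the database in this way shrinks the search space a search engine has to consider, which is the kind of optimization the abstract attributes to custom databases. A real workflow would read the entries from a downloaded FASTA file rather than from an in-memory list.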