Prakash Ananth, Collins Andrew, Vilmovsky Liora, Fexova Silvie, Jones Andrew R, Vizcaino Juan Antonio
European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, U.K.
J Proteome Res. 2025 Feb 7;24(2):685-695. doi: 10.1021/acs.jproteome.4c00788. Epub 2025 Jan 7.
The PRIDE database is the largest public data repository of mass spectrometry-based proteomics data and currently stores more than 40,000 data sets covering a wide range of organisms, experimental techniques, and biological conditions. During the past few years, PRIDE has seen a significant increase in the number of submitted data-independent acquisition (DIA) proteomics data sets. This provides an excellent opportunity for large-scale data reanalysis and reuse. We have reanalyzed 15 public label-free DIA data sets across various healthy human tissues to provide a state-of-the-art view of the human proteome in baseline conditions (without any perturbations). We computed baseline protein abundances and compared them across various tissues, samples, and data sets. Our second aim was to compare the protein abundances obtained here with the results of previous analyses of human baseline data-dependent acquisition (DDA) data sets. We observed a good correlation for some tissues, especially liver and colon, but weak correlations for others, such as lung and pancreas. The reanalyzed results, including protein abundance values and curated metadata, are available to view and download from the Expression Atlas resource.