Shah Tarsh, Fitzpatrick Jackson A, Orsburn Benjamin C
The Advanced Academics Biotechnology Program, The Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, United States.
Proteomic und Genomic Sciences, Baltimore, Maryland 21214, United States.
J Proteome Res. 2025 Sep 5;24(9):4838-4844. doi: 10.1021/acs.jproteome.5c00342. Epub 2025 Jul 28.
Nearly all methods of mass-spectrometry-based proteomics rely on knowing the proteome of the species. In less studied organisms without annotated genomes, it can seem impossible to perform proteomic analysis. In this study, we sought to answer the question: does enough information exist to do proteomics on any organism we want? As a case study, we started with material available due to an infestation of a home with black widow spiders. Thanks to the recent publication of an annotated genome for one species of black widow spider, we were able to identify 5502 protein groups and assign putative annotations using ortholog mapping. We also demonstrate that had we not had this resource, over 2000 proteins could be identified using other available spider genome annotations, despite their unrelatedness. Moreover, regardless of the spider proteome used, proteins annotated as toxins were almost exclusively observed in the main body of the mature female black widow spider. Overall, these results provide a draft proteome map for the black widow spider and valuable data for validating machine learning models while also suggesting that the door to insightful quantitative proteomics may already be open for millions of less studied organisms. All raw and processed proteomic data are available through the ProteomeXchange repository as accession PXD051601.