Hutchings Charlotte, Krueger Thomas, Crook Oliver M, Gatto Laurent, Lilley Kathryn S, Breckels Lisa M
Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK.
Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QR, UK.
F1000Res. 2025 Jul 21;14:714. doi: 10.12688/f1000research.165543.1. eCollection 2025.
Subcellular localisation is a determining factor of protein function. Mass spectrometry-based correlation profiling experiments facilitate the classification of protein subcellular localisation on a proteome-wide scale. In turn, static localisations can be compared across conditions to identify differential protein localisation events.
Here, we provide a workflow for the processing and analysis of subcellular proteomics data derived from mass spectrometry-based correlation profiling experiments. This workflow uses open-source R software packages from the Bioconductor project and discusses in detail the key processing steps required to achieve high-confidence protein localisation classifications and differential localisation predictions. The workflow is applicable to any correlation profiling data, and supplementary code is provided to help users adapt it to data-dependent acquisition (DDA) and data-independent acquisition (DIA) data processed with different database software.
The workflow is divided into three sections. First, we outline data processing using the QFeatures infrastructure to generate high-quality protein correlation profiles. Next, protein subcellular localisation classification is carried out using machine learning. Finally, prediction of differential localisation events is covered for dynamic correlation profiling experiments.
A comprehensive start-to-end workflow for correlation profiling subcellular proteomics experiments is presented. R version: 4.5.0 (2025-04-11). Bioconductor version: 3.21.