Department of Biochemistry, Microbiology and Immunology, and Ottawa Institute of Systems Biology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Room 4170, Ottawa, ON, K1H 8M5, Canada.
BMC Bioinformatics. 2021 Jun 4;22(1):302. doi: 10.1186/s12859-021-04042-6.
Quantitative proteomics studies are often used to detect proteins that are differentially expressed across experimental conditions. Functional enrichment analyses are then typically used to detect annotations, such as biological processes, that are significantly enriched among these differentially expressed proteins, providing insight into the molecular impact of the studied conditions. Although common, this analytical pipeline relies heavily on arbitrary significance thresholds. A functional annotation may nevertheless be dysregulated in a given experimental condition even though none, or very few, of its proteins are individually considered significantly differentially expressed. Standard approaches would therefore miss such an annotation.
Herein, we propose a novel graph theory-based method, PIGNON, for the detection of functional annotations that are differentially expressed across conditions. PIGNON does not assess the statistical significance of the differential expression of individual proteins. Instead, it maps protein differential expression levels onto a protein-protein interaction network and measures how tightly the proteins of a given functional annotation cluster within that network. This process allows the detection of functional annotations whose proteins are both differentially expressed and grouped in the network. A Monte Carlo sampling approach is used to assess the significance of protein clustering in the expression-weighted network. When applied to a quantitative proteomics analysis of different molecular subtypes of breast cancer, PIGNON detects Gene Ontology terms that are both significantly clustered in a protein-protein interaction network and differentially expressed across breast cancer subtypes. PIGNON identified functional annotations that are dysregulated and clustered within the network between the HER2+, triple-negative and hormone receptor-positive subtypes. We show that PIGNON's results are complementary to those of state-of-the-art functional enrichment analyses and that it highlights functional annotations missed by standard approaches. Furthermore, PIGNON detects functional annotations that have been previously associated with specific breast cancer subtypes.
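The general idea described above (weight a protein-protein interaction network by differential expression, score how tightly an annotation's proteins cluster, then compare against randomly sampled protein sets) can be illustrated with a minimal Python sketch. This is not PIGNON's implementation: the abstract does not specify the edge-weighting formula or the clustering measure, so the weighting in `weight_edges`, the total pairwise shortest-path distance used as the clustering score, and all function names and toy values below are assumptions chosen purely for illustration.

```python
import heapq
import random
from itertools import combinations


def weight_edges(edges, fold_change):
    """Build an adjacency map from interaction pairs.

    Assumed weighting (not taken from the paper): an edge gets shorter
    when both of its proteins have large absolute log fold changes.
    """
    graph = {}
    for u, v in edges:
        mean_abs_fc = (abs(fold_change[u]) + abs(fold_change[v])) / 2.0
        w = 1.0 / (1.0 + mean_abs_fc)
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def shortest_paths(graph, source):
    """Dijkstra distances from source to every reachable protein."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist


def total_pairwise_distance(all_dist, proteins):
    """Assumed clustering score: sum of distances over all protein pairs.

    Smaller values mean the proteins sit closer together in the
    expression-weighted network. Unreachable pairs count as infinity.
    """
    return sum(
        all_dist[a].get(b, float("inf")) for a, b in combinations(proteins, 2)
    )


def clustering_p_value(graph, annotated, n_samples=1000, seed=0):
    """Monte Carlo estimate of how unusual the observed clustering is."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    all_dist = {n: shortest_paths(graph, n) for n in nodes}
    observed = total_pairwise_distance(all_dist, annotated)
    hits = 0
    for _ in range(n_samples):
        # Random protein set of the same size as the annotation.
        sample = rng.sample(nodes, len(annotated))
        if total_pairwise_distance(all_dist, sample) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_samples + 1)


# Toy network (hypothetical proteins A..H and fold changes).
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
             ("D", "E"), ("E", "F"), ("F", "G"), ("G", "H")]
toy_fold_change = {"A": 3.0, "B": 3.0, "C": 3.0, "D": 0.1,
                   "E": 0.1, "F": 0.1, "G": 0.1, "H": 0.1}
toy_graph = weight_edges(toy_edges, toy_fold_change)
score, p_value = clustering_p_value(toy_graph, ["A", "B", "C"])
print(f"clustering score={score:.3f}, Monte Carlo p={p_value:.3f}")
```

In this toy network the annotated proteins A, B and C are mutually connected and strongly dysregulated, so their score is small relative to random triples and the estimated p-value is low. Note that no per-protein significance test is performed at any step, which mirrors the motivation stated in the abstract; a real analysis would also need multiple-testing correction across annotations, which this sketch omits.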
PIGNON provides an alternative to functional enrichment analyses and a more comprehensive characterization of quantitative datasets. It thereby contributes to a better understanding of the functions and processes that are dysregulated in biological samples under different experimental conditions.