Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA 90024, USA.
Department of Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90024, USA.
Cell Rep Methods. 2022 Feb 28;2(2). doi: 10.1016/j.crmeth.2022.100167. Epub 2022 Feb 14.
Cell signaling is orchestrated in part through a network of protein kinases and phosphatases. Dysregulation of kinase signaling is widespread in diseases such as cancer and is readily targetable through inhibitors. Mass spectrometry-based analysis can provide a global view of kinase regulation, but mining these data is complicated by the stochastic coverage of the proteome, the measurement of substrates rather than kinases, and the scale of the data. Here, we implement a dual data and motif clustering (DDMC) strategy that simultaneously clusters peptides into similarly regulated groups based on their variation and their sequence profile. We show that this approach can help to identify putative upstream kinases and yields more robust clustering. We apply this clustering to clinical proteomic profiling of lung cancer and identify conserved proteomic signatures of tumorigenicity, genetic mutations, and immune infiltration. We propose that DDMC provides a general and flexible clustering strategy for the analysis of phosphoproteomic data.
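The idea of clustering peptides jointly on their abundance variation and their sequence profile can be illustrated with a small expectation-maximization sketch. This is not the authors' implementation: it assumes complete data (no missing values), a spherical Gaussian per cluster for the abundance profiles, and a simple per-cluster position-specific frequency matrix for the sequences. The names `one_hot`, `dual_cluster`, and the `seq_weight` parameter that balances the two terms are hypothetical and chosen only for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(seqs):
    """Encode equal-length peptide strings as an (n, length, 20) array."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    out = np.zeros((len(seqs), len(seqs[0]), len(AMINO_ACIDS)))
    for i, s in enumerate(seqs):
        for j, aa in enumerate(s):
            out[i, j, index[aa]] = 1.0
    return out

def dual_cluster(X, seqs, k, seq_weight=1.0, n_iter=50, seed=0):
    """Soft-cluster peptides using abundance profiles X and sequences jointly.

    Illustrative assumption: each cluster is a spherical Gaussian over X plus
    a position-specific residue frequency matrix over the sequences.
    """
    rng = np.random.default_rng(seed)
    S = one_hot(seqs)
    n, d = X.shape
    resp = rng.dirichlet(np.ones(k), size=n)  # random soft assignments (n, k)
    for _ in range(n_iter):
        # M-step: update mixing weights, means, variances, and motifs.
        nk = resp.sum(axis=0) + 1e-9
        pi = nk / n
        mu = (resp.T @ X) / nk[:, None]
        sq = (X[:, None, :] - mu[None, :, :]) ** 2            # (n, k, d)
        var = (resp[:, :, None] * sq).sum(axis=(0, 2)) / (nk * d) + 1e-6
        counts = np.einsum("nk,nla->kla", resp, S) + 1.0      # pseudocount of 1
        log_pssm = np.log(counts / counts.sum(axis=2, keepdims=True))
        # E-step: combine data and sequence log-likelihoods per cluster.
        log_data = -0.5 * (sq.sum(axis=2) / var + d * np.log(2 * np.pi * var))
        log_seq = np.einsum("nla,kla->nk", S, log_pssm)
        log_p = np.log(pi) + log_data + seq_weight * log_seq
        log_p -= log_p.max(axis=1, keepdims=True)             # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
    return resp.argmax(axis=1), mu, np.exp(log_pssm)

# Synthetic demo: two groups that differ in both abundance and motif.
demo_rng = np.random.default_rng(1)
X_demo = np.vstack([2 + 0.1 * demo_rng.standard_normal((20, 4)),
                    -2 + 0.1 * demo_rng.standard_normal((20, 4))])
seqs_demo = ["RRSPA"] * 20 + ["EETDD"] * 20
labels, means, motifs = dual_cluster(X_demo, seqs_demo, k=2)
```

Setting `seq_weight` to 0 reduces this sketch to a plain Gaussian mixture on the abundance data, while large values let the sequence motif dominate the assignment. The returned `motifs` array holds one residue frequency matrix per cluster, which is the kind of profile one could compare against known kinase specificity motifs.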