Biology Department, Reed College, Portland, Oregon, USA.
Pac Symp Biocomput. 2022;27:211-222.
A major goal of molecular systems biology is to understand the coordinated function of genes or proteins in response to cellular signals, and to understand these dynamics in the context of disease. Signaling pathway databases such as KEGG, NetPath, NCI-PID, and Panther describe the molecular interactions involved in different cellular responses. While the same pathway may be present in different databases, prior work has shown that the particular proteins and interactions differ across database annotations. However, to our knowledge, no one has attempted to quantify the structural differences between these annotations. Characterizing artifacts and other biases within pathway databases is important because it enables a more informed interpretation of downstream analyses. In this work, we consider signaling pathways as graphs and use topological measures to study their structure. We find that topological characterization using graphlets (small, connected subgraphs) distinguishes signaling pathways from appropriate null models of interaction networks. Next, we quantify topological similarity across pathway databases. Our analysis reveals that the pathways harbor database-specific characteristics, implying that even though these databases describe the same pathways, their representations tend to be systematically different from one another. We show that pathway-specific topology can be uncovered after accounting for database-specific structure. This work presents a first step towards elucidating common pathway structure beyond its specific database annotations.
Data Availability: https://github.com/Reed-CompBio/pathway-reconciliation.
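To make the idea of graphlet-based characterization concrete, the following is a minimal sketch and not the authors' implementation. It assumes an undirected, simple graph, restricts itself to the two 3-node graphlets (open path and triangle), and uses degree-preserving edge swaps as the null model. The abstract does not specify graphlet sizes, edge directionality, or the null models actually used, so all of these choices, along with the function names `build_adjacency`, `count_three_node_graphlets`, and `rewire`, are illustrative assumptions.

```python
import random
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_three_node_graphlets(adj):
    """Return (open_paths, triangles): counts of induced 3-node graphlets."""
    wedges = 0          # connected triples, counted at their center node
    closed_wedges = 0   # wedges whose two end nodes are also adjacent
    for nbrs in adj.values():
        k = len(nbrs)
        wedges += k * (k - 1) // 2
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                closed_wedges += 1
    triangles = closed_wedges // 3       # each triangle closes 3 wedges
    open_paths = wedges - closed_wedges  # remaining wedges are induced paths
    return open_paths, triangles


def rewire(edges, n_swaps, seed=0):
    """Hypothetical null model: degree-preserving double edge swaps.

    Assumes `edges` contains no duplicates and no self-loops.
    """
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue  # swap would create a duplicate edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


# Toy "pathway": a triangle A-B-C with a tail C-D (made-up node names).
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_counts = count_three_node_graphlets(build_adjacency(toy_edges))
```

On the toy graph, `toy_counts` is `(2, 1)`: two induced open paths (A-C-D and B-C-D) and one triangle. Comparing such counts for a pathway against the counts obtained from many rewired copies gives a simple indication of whether its local topology differs from what its degree sequence alone would produce.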