Mei Suyu, Flemington Erik K, Zhang Kun
Software College, Shenyang Normal University, Shenyang, 110034, China.
Integr Biol (Camb). 2017 Jul 17;9(7):595-606. doi: 10.1039/c7ib00013h.
Recognition of indirect interactions is instrumental to in silico reconstruction of signaling pathways and sheds light on unknown physical paths between two indirectly interacting genes. However, very few computational methods have so far explicitly exploited indirect interactions that are supported by experimental evidence. In this work, we attempt to distinguish direct from indirect interactions in human functional protein-protein interaction (PPI) networks using a predictive l-regularized logistic regression model built on experimental data. The l-regularized logistic regression method is adopted to counteract potential homolog noise and to reduce the computational cost of training on large data sets. Computational results show that the proposed model performs promisingly even though the training data are highly skewed. Among the 304 799 PPIs curated in several databases, the proposed method detects 23 131 indirect interactions; for most of these, a breadth-first graph search finds dozens of physical paths between the interacting partners, which corroborates the prediction. Pathway enrichment analysis shows that most of the physical paths can be mapped onto more than one human signaling pathway, indicating that a series of biochemical signals does exist between the two indirectly interacting genes. The interactome-scale computational results promise to provide useful cues for the following applications: (1) exploration of unknown physical PPIs or physical paths between two indirectly interacting genes; (2) amending or extending existing signaling pathways; (3) recognition of physical PPIs for druggable target discovery.
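The verification step described above, searching a physical PPI network breadth-first for paths that connect two indirectly interacting genes, can be illustrated with a minimal sketch. The abstract does not specify the exact search procedure, so the function name `find_physical_paths`, the hop limit `max_hops`, and the toy gene names below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import deque


def find_physical_paths(physical_edges, source, target, max_hops=3):
    """Enumerate simple paths of at most max_hops edges from source to target.

    physical_edges is an iterable of undirected (gene_a, gene_b) pairs.
    Paths are returned shortest first because the search is breadth-first.
    """
    # Build an undirected adjacency map from the edge list.
    adjacency = {}
    for a, b in physical_edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target and len(path) > 1:
            paths.append(path)
            continue  # do not extend a path past the target
        if len(path) - 1 >= max_hops:
            continue  # hop limit reached (an illustrative cutoff)
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in path:  # keep paths simple (no revisits)
                queue.append(path + [neighbor])
    return paths


# Toy physical network with hypothetical gene names.
toy_edges = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("C", "E")]
paths = find_physical_paths(toy_edges, "A", "C", max_hops=3)
for p in paths:
    print(" -> ".join(p))
```

In this toy network, genes A and C have no direct physical edge, yet the search returns two intermediate routes. Under the assumptions above, the existence of such paths is the kind of evidence that would support labeling the A-C pair as an indirect interaction.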