Fusing gene expressions and transitive protein-protein interactions for inference of gene regulatory networks.

Author information

Liu Wenting, Rajapakse Jagath C

Affiliations

School of Public Health and Management, Hubei University of Medicine, Shiyan, Hubei, China.

Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA, USA.

Publication information

BMC Syst Biol. 2019 Apr 5;13(Suppl 2):37. doi: 10.1186/s12918-019-0695-x.

Abstract

BACKGROUND

Systematic fusion of multiple data sources for gene regulatory network (GRN) inference remains a key challenge in systems biology. We incorporate information from protein-protein interaction networks (PPIN) into GRN inference from gene expression (GE) data. However, existing PPIN remain sparse, and transitive protein interactions can help predict missing protein interactions. We therefore propose a systematic probabilistic framework for fusing GE data and transitive protein interaction data to coherently build GRN.

RESULTS

We use a Gaussian Mixture Model (GMM) to soft-cluster GE data, allowing overlapping cluster memberships. Next, we propose a heuristic method to extend sparse PPIN by incorporating transitive linkages. We then propose a novel way to score the extended protein interactions by combining topological properties of PPIN with correlations of GE. GE data and extended PPIN are then fused using a Gaussian Hidden Markov Model (GHMM) to identify gene regulatory pathways and refine the interaction scores, which are in turn used to constrain the GRN structure. Finally, we employ a Bayesian Gaussian Mixture (BGM) model to refine the GRN derived from GE data, using the structural priors derived from the GHMM. Experiments on real yeast regulatory networks demonstrate both the feasibility of the extended PPIN in predicting transitive protein interactions and its effectiveness in improving the coverage and accuracy of the proposed method, which fuses PPIN and GE to build GRN.
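The PPIN extension and scoring steps can be illustrated with a minimal sketch. The abstract does not give the actual heuristic or scoring formula, so everything here is an assumption made for illustration: a candidate transitive link is taken to be any unlinked protein pair that shares at least one interaction partner, the topological term is the Jaccard overlap of the two neighbourhoods, the expression term is the absolute Pearson correlation, and the weight `alpha` and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant profile has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0


def neighbours(edges):
    """Build an adjacency map from an undirected edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def transitive_candidates(edges):
    """Unlinked pairs that share at least one interaction partner (assumed heuristic)."""
    adj = neighbours(edges)
    found = set()
    for partners in adj.values():
        for a, b in combinations(sorted(partners), 2):
            if b not in adj[a]:
                found.add((a, b))
    return found


def score(pair, adj, expression, alpha=0.5):
    """Weighted mix of neighbourhood overlap and GE correlation (assumed formula)."""
    a, b = pair
    shared = adj[a] & adj[b]
    union = (adj[a] | adj[b]) - {a, b}
    topo = len(shared) / len(union) if union else 0.0
    corr = abs(pearson(expression[a], expression[b]))
    return alpha * topo + (1 - alpha) * corr


if __name__ == "__main__":
    # Toy data, invented for this example only.
    ppin = [("A", "B"), ("B", "C"), ("C", "D")]
    ge = {"A": [1, 2, 3, 4], "B": [4, 1, 3, 2],
          "C": [2, 4, 6, 8], "D": [5, 3, 4, 1]}
    adj = neighbours(ppin)
    for pair in sorted(transitive_candidates(ppin)):
        print(pair, round(score(pair, adj, ge), 3))
```

In the pipeline described above, scores of this kind would only be a starting point: the GHMM stage refines them before they constrain the GRN structure.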

CONCLUSION

The GE and PPIN fusion model outperforms both state-of-the-art single-data-source models (CLR, GENIE3, TIGRESS) and existing fusion models under various constraints.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/570d/6449891/6f88ac9ca7c4/12918_2019_695_Fig1_HTML.jpg
