Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.
Department of Pathology, Johns Hopkins Medical Institutions, Baltimore, MD 21231, USA.
Bioinformatics. 2018 May 15;34(10):1733-1740. doi: 10.1093/bioinformatics/btx827.
Next-generation sequencing (NGS) techniques have been widely applied in genetic and epigenetic studies. Multiple ChIP-seq and RNA-seq profiles can now be used jointly to infer functional regulatory networks (FRNs). However, existing methods for FRN inference from large-scale ChIP-seq and time-course RNA-seq data either make oversimplified assumptions about transcription factor (TF) regulation or suffer from slow convergence of sampling.
We developed an efficient Bayesian integration method (CRNET) for FRN inference, which uses a two-stage Gibbs sampler to iteratively estimate hidden TF activities and the posterior probabilities of binding events. A novel statistical measure that jointly considers regulation strength and regression error enables the sampling process of CRNET to converge quickly, making CRNET efficient for large-scale FRN inference. In experiments on synthetic and benchmark data, CRNET performed significantly better than existing methods. We applied CRNET to breast cancer data to identify FRNs that are functional at promoter or enhancer regions in MCF-7 breast cancer cells. The transcription factor MYC is predicted to be a key functional factor in both promoter and enhancer FRNs. We experimentally validated the regulatory effects of MYC on CRNET-predicted target genes using RNAi approaches in MCF-7 cells.
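To make the alternating structure of a two-stage Gibbs sampler concrete, the following is a minimal sketch under simplifying assumptions; it is not the CRNET implementation. It assumes a linear model Y = BX + noise with binary binding indicators B, Gaussian TF activities X, and a ChIP-seq-derived prior probability for each binding event. It omits regulation strengths and the convergence-accelerating statistical measure described above. The function name `gibbs_binding_sketch` and the hyperparameters `sigma` and `tau` are hypothetical.

```python
import numpy as np

def gibbs_binding_sketch(Y, prior, n_iter=200, burn_in=100,
                         sigma=0.5, tau=1.0, seed=0):
    """Toy two-stage Gibbs sampler for Y ~ N(B @ X, sigma^2).

    Y     : (n_genes, n_times) expression matrix.
    prior : (n_genes, n_tfs) prior binding probabilities (assumed input).
    Returns the post-burn-in frequency of each binding indicator.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_times = Y.shape
    n_tfs = prior.shape[1]
    prior = np.clip(prior, 1e-6, 1 - 1e-6)
    prior_logit = np.log(prior) - np.log1p(-prior)
    B = (rng.random(prior.shape) < prior).astype(float)
    post = np.zeros((n_genes, n_tfs))

    for it in range(n_iter):
        # Stage 1: sample hidden TF activities X given the binding matrix B.
        # With a N(0, tau^2) prior, each time column has a Gaussian posterior.
        precision = B.T @ B / sigma**2 + np.eye(n_tfs) / tau**2
        cov = np.linalg.inv(precision)
        cov = (cov + cov.T) / 2  # enforce symmetry before factorizing
        mean = cov @ B.T @ Y / sigma**2
        X = mean + np.linalg.cholesky(cov) @ rng.standard_normal((n_tfs, n_times))

        # Stage 2: sample binding indicators given X, one TF column at a time.
        for j in range(n_tfs):
            # Residual of every gene with TF j's contribution removed.
            resid = Y - B @ X + np.outer(B[:, j], X[j])
            # Log-likelihood gain from switching binding on versus off.
            gain = (np.sum(resid**2, axis=1)
                    - np.sum((resid - X[j])**2, axis=1)) / (2 * sigma**2)
            logit = np.clip(prior_logit[:, j] + gain, -50, 50)
            prob = 1.0 / (1.0 + np.exp(-logit))
            B[:, j] = (rng.random(n_genes) < prob).astype(float)

        if it >= burn_in:
            post += B

    return post / (n_iter - burn_in)
```

Each entry of the returned matrix is the fraction of retained samples in which the corresponding gene was assigned as a target of the corresponding TF, which serves as a rough estimate of the posterior binding probability in this toy model.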
R scripts of CRNET are available at http://www.cbil.ece.vt.edu/software.htm.
Supplementary data are available at Bioinformatics online.