Cancer Research UK Cambridge Research Institute, Cambridge, Cambridgeshire, United Kingdom.
PLoS Comput Biol. 2012;8(6):e1002566. doi: 10.1371/journal.pcbi.1002566. Epub 2012 Jun 28.
Combinatorial gene perturbations provide rich information for a systematic exploration of genetic interactions. Despite successful applications to bacteria and yeast, the scalability of this approach remains a major challenge for higher organisms such as humans. Here, we report a novel experimental and computational framework to efficiently address this challenge by limiting the 'search space' for important genetic interactions. We propose to integrate rich phenotypes of multiple single-gene perturbations to robustly predict functional modules, which can subsequently be subjected to further experimental investigations such as combinatorial gene silencing. We present posterior association networks (PANs), estimated using a Bayesian mixture modelling approach, to predict functional interactions between genes. The major advantage of this approach over conventional hypothesis tests is that prior knowledge can be incorporated to enhance predictive power. We demonstrate, in a simulation study and on biological data, that integrating complementary information greatly improves prediction accuracy. To search for significant modules, we perform hierarchical clustering with multiscale bootstrap resampling. We demonstrate the power of the proposed methodologies in applications to Ewing's sarcoma and human adult stem cells using publicly available and custom-generated data, respectively. In the former application, we identify a gene module including many confirmed and highly promising therapeutic targets. Genes in the module are also significantly overrepresented in signalling pathways that are known to be critical for proliferation of Ewing's sarcoma cells. In the latter application, we predict a functional network of chromatin factors controlling epidermal stem cell fate. Further examinations using ChIP-seq, ChIP-qPCR and RT-qPCR reveal that the basis of their genetic interactions may arise from transcriptional cross-regulation.
A Bioconductor package implementing PAN is freely available online at http://bioconductor.org/packages/release/bioc/html/PANR.html.
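The idea of a posterior association between two genes, and of prior knowledge entering through the mixture weights, can be illustrated with a minimal sketch. It is not the PANR implementation and does not reproduce the model fitted in the paper. It assumes that the association score is the cosine similarity between two phenotype profiles, that the three mixture components (negative, none, positive) are Gaussians with hand-picked parameters, and that prior knowledge is supplied as pair-specific mixing weights. All names and numbers below are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Association score between two phenotype profiles (an assumed choice)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def normal_pdf(x, mean, sd):
    """Density of a normal distribution, used here as a stand-in component."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Hypothetical (mean, sd) for each mixture component. In a real analysis
# these would be estimated from the data, not fixed by hand.
COMPONENTS = {
    "negative": (-0.6, 0.2),
    "none": (0.0, 0.2),
    "positive": (0.6, 0.2),
}

# Hypothetical priors: a flat prior, and a sceptical prior for a gene pair
# with no supporting evidence from other data sources.
FLAT_PRIOR = {"negative": 1 / 3, "none": 1 / 3, "positive": 1 / 3}
SCEPTICAL_PRIOR = {"negative": 0.05, "none": 0.90, "positive": 0.05}

def posterior(score, prior):
    """Bayes' rule: posterior probability of each component given the score."""
    weighted = {
        name: prior[name] * normal_pdf(score, mean, sd)
        for name, (mean, sd) in COMPONENTS.items()
    }
    total = sum(weighted.values())
    return {name: value / total for name, value in weighted.items()}

if __name__ == "__main__":
    # Made-up phenotype profiles of two genes across four readouts.
    gene_a = [1.2, -0.8, 0.5, 2.0]
    gene_b = [1.0, -1.1, 0.7, 1.8]
    score = cosine_similarity(gene_a, gene_b)
    print("score:", round(score, 3))
    print("flat prior:", posterior(score, FLAT_PRIOR))
    print("sceptical prior:", posterior(score, SCEPTICAL_PRIOR))
```

For the same score, the sceptical prior yields a lower posterior probability of a positive association than the flat prior. This is the sense in which complementary information can shift a prediction that a hypothesis test on the score alone would leave unchanged.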