Paulsen Jonas, Rødland Einar A, Holden Lars, Holden Marit, Hovig Eivind
Institute for Cancer Genetics and Informatics, Oslo University Hospital, PO Box 4950, Nydalen, N-0424 Oslo, Norway
Department of Tumor Biology, Institute for Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, PO Box 4950, Nydalen, N-0424 Oslo, Norway.
Nucleic Acids Res. 2014 Oct;42(18):e143. doi: 10.1093/nar/gku738. Epub 2014 Aug 11.
Identification of three-dimensional (3D) interactions between regulatory elements across the genome is crucial to unravel the complex regulatory machinery that orchestrates proliferation and differentiation of cells. ChIA-PET is a novel method to identify such interactions, where physical contacts between regions bound by a specific protein are quantified using next-generation sequencing. However, determining the significance of the observed interaction frequencies in such datasets is challenging, and few methods have been proposed. Despite the fact that regions that are close in linear genomic distance have a much higher tendency to interact by chance, no methods to date are capable of taking such dependency into account. Here, we propose a statistical model taking into account the genomic distance relationship, as well as the general propensity of anchors to be involved in contacts overall. Using both real and simulated data, we show that the previously proposed statistical test, based on Fisher's exact test, leads to invalid results when data are dependent on genomic distance. We also evaluate our method on previously validated cell-line specific and constitutive 3D interactions, and show that relevant interactions are significant, while avoiding over-estimating the significance of short nearby interactions.
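To make the criticism of the count-only test concrete, the following is a minimal sketch of a hypergeometric (Fisher-type) tail probability for a single anchor pair. It is an illustration under simplifying assumptions, not the exact formulation of the earlier test nor the model proposed here: the function name `hypergeom_tail`, the parameterization (total paired-end tags as the population, per-anchor tag counts as the margins), and the example counts are all hypothetical.

```python
from math import comb


def hypergeom_tail(k, n_total, n_a, n_b):
    """Return P(X >= k) for X ~ Hypergeometric(n_total, n_a, n_b).

    Assumed (hypothetical) interpretation:
      n_total: total number of paired-end tags in the dataset
      n_a:     tags involving anchor A
      n_b:     tags involving anchor B
      k:       observed tags linking A and B
    """
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so terms outside the support contribute nothing to the sum.
    numer = sum(
        comb(n_a, i) * comb(n_total - n_a, n_b - i)
        for i in range(max(k, 0), upper + 1)
    )
    return numer / denom


# Two hypothetical anchor pairs with identical counts but different
# genomic separations (10 kb vs 10 Mb). The distance never enters the
# calculation, so both pairs receive exactly the same p-value.
p_near = hypergeom_tail(k=5, n_total=10_000, n_a=40, n_b=50)
p_far = hypergeom_tail(k=5, n_total=10_000, n_a=40, n_b=50)
```

Because the only inputs are the counts, a nearby pair that reaches five linking tags largely by chance proximity is scored the same as a distant pair with the same counts, which is the over-estimation of short-range significance that a distance-aware model is meant to avoid.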