Computer Laboratory, University of Cambridge, Cambridge CB3 0FD, UK.
University of Cambridge, Cambridge CB3 0FD, UK and Department of Life Sciences, Imperial College London, London SW7 2AZ, UK.
Bioinformatics. 2016 Apr 15;32(8):1121-9. doi: 10.1093/bioinformatics/btv736. Epub 2015 Dec 17.
Recent advances in molecular methods have made it possible to capture physical contacts between multiple chromatin fragments. The resulting association matrices provide a noisy estimate of average spatial proximity, which can be used to gain insight into genome organization inside the nucleus. However, extracting topological information from these data is challenging, and integrating them across resolutions remains poorly addressed. Recent findings suggest that a hierarchical approach could help with both problems.
We present an algorithmic framework, based on hierarchical block matrices (HBMs), for topological analysis and integration of chromosome conformation capture (3C) data. We first describe chromoHBM, an algorithm that compresses high-throughput 3C (HiT-3C) data into topological features that are efficiently summarized with an HBM representation. Because directly combining HiT-3C datasets across resolutions is difficult, we suggest integrating their HBM representations instead, and describe chromoHBM-3C, an algorithm that merges HBMs. Since three-dimensional (3D) reconstruction can also benefit from topological information, we further present chromoHBM-3D, an algorithm that exploits the HBM representation to gradually introduce topological constraints into the reconstruction process. We evaluate our approach in light of previous image microscopy findings and epigenetic data, and show that it can relate multiple spatial scales and provide a more complete view of the 3D genome architecture.
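To make the idea of a hierarchical block matrix concrete, the following is a minimal illustrative sketch and not the published chromoHBM procedure. It assumes a simple rule that the abstract does not state: adjacent clusters of genomic bins are merged greedily by their mean contact count, and each matrix entry records the merge step at which two bins first share a cluster. The function name `toy_hbm` and the example matrix are hypothetical.

```python
def toy_hbm(contacts):
    """Build a toy hierarchical block matrix from a symmetric contact matrix.

    Assumption (not from the paper): adjacent clusters of bins are merged
    greedily by mean contact. Entry (i, j) of the result is the merge step at
    which bins i and j first fall in the same cluster; the diagonal is 0.
    """
    n = len(contacts)
    hbm = [[0] * n for _ in range(n)]
    clusters = [[i] for i in range(n)]  # each cluster is a contiguous run of bins

    def mean_contact(a, b):
        # Average contact count over all bin pairs spanning clusters a and b.
        total = sum(contacts[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    level = 0
    while len(clusters) > 1:
        level += 1
        # Choose the adjacent pair of clusters with the strongest mean contact.
        k = max(
            range(len(clusters) - 1),
            key=lambda idx: mean_contact(clusters[idx], clusters[idx + 1]),
        )
        left, right = clusters[k], clusters[k + 1]
        # Bins on opposite sides of the merge first meet at this level.
        for i in left:
            for j in right:
                hbm[i][j] = hbm[j][i] = level
        clusters[k:k + 2] = [left + right]
    return hbm


# Hypothetical 4-bin contact matrix with two obvious domains: {0, 1} and {2, 3}.
toy_contacts = [
    [0, 9, 1, 1],
    [9, 0, 2, 1],
    [1, 2, 0, 8],
    [1, 1, 8, 0],
]
toy_result = toy_hbm(toy_contacts)
for row in toy_result:
    print(row)
```

In this toy output, small values mark bins that join early (the same domain) and larger values mark bins that meet only at coarser levels, which gives the block-within-block structure. Because such a matrix depends only on merge order rather than raw counts, representations built at different resolutions are easier to compare than the contact matrices themselves.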
The presented algorithms are available from: https://github.com/yolish/hbm
Contact: ys388@cam.ac.uk or pl219@cam.ac.uk
Supplementary data are available at Bioinformatics online.