Weinreb Caleb, Raphael Benjamin J
Center for Computational Molecular Biology and Department of Computer Science, Brown University, Providence, RI 02912, USA.
Bioinformatics. 2016 Jun 1;32(11):1601-9. doi: 10.1093/bioinformatics/btv485. Epub 2015 Aug 26.
The three-dimensional structure of the genome is an important regulator of many cellular processes including differentiation and gene regulation. Recently, technologies such as Hi-C that combine proximity ligation with high-throughput sequencing have revealed domains of self-interacting chromatin, called topologically associating domains (TADs), in many organisms. Current methods for identifying TADs using Hi-C data assume that TADs are non-overlapping, despite evidence for a nested structure in which TADs and sub-TADs form a complex hierarchy.
We introduce a model for decomposition of contact frequencies into a hierarchy of nested TADs. This model is based on empirical distributions of contact frequencies within TADs, where positions that are far apart have a greater enrichment of contacts than positions that are close together. We find that the increase in contact enrichment with distance is stronger for the inner TAD than for the outer TAD in a TAD/sub-TAD pair. Using this model, we develop the TADtree algorithm for detecting hierarchies of nested TADs. TADtree compares favorably with previous methods, finding TADs with a greater enrichment of chromatin marks such as CTCF at their boundaries.
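The relationship described above, in which contact enrichment grows with distance inside a TAD, can be illustrated with a minimal sketch. This is not the TADtree implementation. It assumes a symmetric, binned contact matrix, takes the mean of each diagonal as the distance-dependent background, and fits a straight line to enrichment versus distance. The function names (`synthetic_contacts`, `distance_background`, `enrichment_slope`) and the toy parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def synthetic_contacts(n=40, tad=(10, 20), beta=0.1):
    """Toy symmetric contact matrix (illustrative assumption, not real Hi-C data).

    Contacts decay with bin separation everywhere; inside the interval `tad`
    they are multiplied by a factor that grows linearly with separation.
    """
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])
    contacts = 100.0 / (1.0 + dist)
    s, e = tad
    contacts[s:e, s:e] *= 1.0 + beta * dist[s:e, s:e]
    return contacts


def distance_background(contacts):
    """Mean contact frequency at each bin separation d (the d-th diagonal)."""
    n = contacts.shape[0]
    return np.array([np.diagonal(contacts, offset=d).mean() for d in range(n)])


def enrichment_slope(contacts, start, end):
    """Least-squares fit of enrichment = intercept + slope * distance
    over all bin pairs inside the half-open interval [start, end)."""
    background = distance_background(contacts)
    distances, enrichments = [], []
    for i in range(start, end):
        for j in range(i + 1, end):
            d = j - i
            if background[d] > 0:
                distances.append(d)
                # Enrichment: relative excess of observed over expected contacts.
                enrichments.append(contacts[i, j] / background[d] - 1.0)
    slope, intercept = np.polyfit(distances, enrichments, 1)
    return slope, intercept


contacts = synthetic_contacts()
slope_tad, _ = enrichment_slope(contacts, 10, 20)      # interval covering the toy TAD
slope_control, _ = enrichment_slope(contacts, 25, 35)  # interval with no TAD
```

Under these assumptions, the fitted slope for the interval covering the toy TAD is positive and larger than the slope for the control interval. Comparing such slopes for an inner and an outer interval is one simple way to explore the TAD/sub-TAD observation stated above.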
A Python implementation of TADtree is available at http://compbio.cs.brown.edu/software/.
Supplementary data are available at Bioinformatics online.