Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong, China.
Genome Biol. 2021 Jan 25;22(1):45. doi: 10.1186/s13059-020-02234-6.
Topologically associating domains (TADs) are the organizational units of chromosome structure. TADs can contain other TADs, thus forming a hierarchy. TAD hierarchies can be inferred from Hi-C data through coding trees. However, the current method for computing coding trees is not optimal. In this paper, we propose optimal algorithms for this computation. In a comparison with seven state-of-the-art methods on two public datasets, from GM12878 and IMR90 cells, SuperTAD shows significant enrichment of structural proteins around detected boundaries and of histone modifications within TADs, and displays high consistency across different resolutions of the same Hi-C matrices.