CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona 08028, Spain.
Gastrointestinal and Endocrine Tumors Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona 08035, Spain.
Nucleic Acids Res. 2020 Apr 17;48(7):e39. doi: 10.1093/nar/gkaa087.
The rapid development of Chromosome Conformation Capture (3C)-based techniques, together with imaging and bioinformatics analyses, has been fundamental in revealing that chromosomes are organized into so-called topologically associating domains (TADs). Although TADs appear as nested patterns in 3C-based interaction matrices, the vast majority of available TAD callers assume that TADs are individual, unrelated chromatin structures. Here we introduce TADpole, a computational tool designed to identify and analyze the entire hierarchy of TADs in intra-chromosomal interaction matrices. TADpole combines principal component analysis and constrained hierarchical clustering to provide a set of significant hierarchical chromatin levels in a genomic region of interest. TADpole is robust to data resolution, normalization strategy and sequencing depth. Domain borders defined by TADpole are enriched in the main architectural proteins (CTCF and cohesin complex subunits) and in the histone mark H3K4me3, while the domain bodies, depending on their activation state, are enriched in either H3K36me3 or H3K27me3, highlighting that TADpole is able to distinguish functional TAD units. Additionally, we demonstrate that TADpole's hierarchical annotation, together with the new DiffT score, allows the detection of significant topological differences on Capture Hi-C maps between wild-type and genetically engineered mice.
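The combination of principal component analysis with constrained hierarchical clustering can be illustrated with a minimal sketch. This is not TADpole's implementation. It assumes a Ward-style merge cost, a contiguity constraint under which only neighbouring genomic segments may merge, and a toy block-diagonal matrix; none of these is specified in the abstract. The function names `pca_reduce` and `contiguous_hierarchy` are hypothetical, and the sketch does not assess which hierarchical levels are significant.

```python
import numpy as np


def pca_reduce(matrix, n_components):
    """Project the rows of `matrix` onto their top principal components (via SVD)."""
    centered = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def contiguous_hierarchy(features):
    """Agglomerative clustering in which only adjacent segments may merge.

    Assumption: a Ward-style cost (size-weighted squared distance between
    segment centroids) decides which neighbouring pair merges next.
    Returns one partition per level; each partition is a list of
    (start, end) bin ranges with `end` exclusive.
    """
    segments = [(i, i + 1) for i in range(len(features))]
    levels = [list(segments)]
    while len(segments) > 1:
        costs = []
        for (s1, e1), (s2, e2) in zip(segments, segments[1:]):
            a, b = features[s1:e1], features[s2:e2]
            diff = a.mean(axis=0) - b.mean(axis=0)
            weight = len(a) * len(b) / (len(a) + len(b))
            costs.append(weight * float(diff @ diff))
        k = int(np.argmin(costs))  # cheapest neighbouring pair
        segments[k:k + 2] = [(segments[k][0], segments[k + 1][1])]
        levels.append(list(segments))
    return levels


# Toy interaction matrix: three planted domains of 4, 3 and 5 bins
# (high contact counts within a domain, low counts between domains).
labels = np.repeat(np.arange(3), [4, 3, 5])
toy = np.where(labels[:, None] == labels[None, :], 10.0, 1.0)

levels = contiguous_hierarchy(pca_reduce(toy, n_components=2))
print(levels[-3])  # the partition with three segments
```

Because merges are restricted to neighbouring segments, every level of the resulting hierarchy is a partition of the region into contiguous domains, which is the property that makes a nested, TAD-like annotation possible.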