Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, 53715, USA.
Wisconsin Institute for Discovery, 330 N. Orchard Street, Madison, 53715, USA.
Genome Biol. 2021 May 25;22(1):164. doi: 10.1186/s13059-021-02378-z.
High-throughput chromosome conformation capture assays, such as Hi-C, have shown that the genome is organized into structural units such as topologically associating domains (TADs), which can impact gene regulatory processes. The sparsity of Hi-C matrices poses a challenge for reliable detection of these units. We present GRiNCH, a constrained matrix-factorization-based approach for simultaneous smoothing and discovery of TADs from sparse contact count matrices. GRiNCH outperforms seven TAD-calling methods and three smoothing methods. GRiNCH is applicable to multiple platforms, including SPRITE and HiChIP, and can predict novel boundary factors with potential roles in genome organization.
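The idea of factorizing a sparse contact matrix to smooth it and group bins into domains at the same time can be illustrated with a small sketch. The code below is not GRiNCH: it uses plain, unconstrained non-negative matrix factorization with multiplicative updates as a stand-in, leaves out the constraint that GRiNCH adds, and assigns each bin to its largest factor as a simplified clustering step. The function names (`nmf`, `toy_contacts`), the toy block matrix, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Plain NMF, X ~ U @ V.T, via multiplicative updates (Frobenius loss).

    Stand-in for a constrained factorization; no constraint is applied here.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.random((n, k)) + eps
    V = rng.random((m, k)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    return U, V


def toy_contacts(block_sizes, inside=10.0, dropout=((1, 3), (7, 10))):
    """Hypothetical contact matrix: dense diagonal blocks with a few zeroed entries."""
    n = sum(block_sizes)
    X = np.zeros((n, n))
    start = 0
    for size in block_sizes:
        X[start:start + size, start:start + size] = inside
        start += size
    # Simulate sparsity: drop a few within-block contacts symmetrically.
    for i, j in dropout:
        X[i, j] = X[j, i] = 0.0
    return X


X = toy_contacts([6, 6])      # two domain-like blocks of 6 bins each
U, V = nmf(X, k=2)            # k = assumed number of domains
smoothed = U @ V.T            # low-rank reconstruction fills in dropped entries
labels = U.argmax(axis=1)     # simplified assignment: largest factor per bin

print("bin labels:", labels)
print("dropped entry (1, 3) after smoothing:", round(float(smoothed[1, 3]), 2))
```

In this toy setting the low-rank reconstruction assigns a non-zero value to contacts that were zeroed out, and the factor loadings separate the two blocks. A real contact matrix would also need normalization and a way to keep clusters contiguous along the chromosome, which this sketch does not attempt.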