Lainscsek Xenia, Taher Leila
Institute of Biomedical Informatics, Graz University of Technology, Graz, Austria.
NAR Genom Bioinform. 2024 Jul 2;6(3):lqae076. doi: 10.1093/nargab/lqae076. eCollection 2024 Sep.
Hi-C and micro-C sequencing have shed light on the profound importance of 3D genome organization in cellular function by probing 3D contact frequencies across the linear genome. The resulting contact matrices are extremely sparse and susceptible to technical- and sequence-based biases, making their comparison challenging. The development of reliable, robust and efficient methods for quantifying similarity between contact matrices is crucial for investigating variations in the 3D genome organization in different cell types or under different conditions, as well as evaluating experimental reproducibility. We present a novel method, ENT3C, which measures the change in pattern complexity in the vicinity of contact matrix diagonals to quantify their similarity. ENT3C provides a robust, user-friendly Hi-C or micro-C contact matrix similarity metric and a characteristic entropy signal that can be used to gain detailed biological insights into 3D genome organization.
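The idea of tracking pattern complexity along the contact matrix diagonal can be illustrated with a small sketch. The abstract does not give the entropy definition, window size, step, or final comparison, so everything here is an assumption made for illustration: the von Neumann entropy of each window's Pearson correlation matrix, the `window` and `step` defaults, the Pearson correlation between the two entropy signals as the similarity score, and the synthetic `toy_contact_matrix` helper. It should not be read as the ENT3C implementation.

```python
import numpy as np


def von_neumann_entropy(sub):
    """Entropy of the eigenvalue spectrum of a window's row-correlation matrix."""
    # Constant rows have zero variance and yield NaN correlations; treat them as 0.
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(sub)
    corr = np.nan_to_num(corr, nan=0.0)
    np.fill_diagonal(corr, 1.0)
    rho = corr / corr.shape[0]          # rescale to unit trace
    eig = np.linalg.eigvalsh(rho)
    eig = eig[eig > 1e-12]              # drop zero and numerically negative eigenvalues
    return float(-np.sum(eig * np.log(eig)))


def entropy_signal(matrix, window=20, step=5):
    """Entropy of square windows slid along the main diagonal."""
    n = matrix.shape[0]
    starts = range(0, n - window + 1, step)
    return np.array(
        [von_neumann_entropy(matrix[i:i + window, i:i + window]) for i in starts]
    )


def similarity(a, b, window=20, step=5):
    """Pearson correlation between the entropy signals of two matrices."""
    sa = entropy_signal(a, window, step)
    sb = entropy_signal(b, window, step)
    return float(np.corrcoef(sa, sb)[0, 1])


def toy_contact_matrix(n=100, seed=0):
    """Hypothetical symmetric count matrix with distance decay and Poisson noise."""
    rng = np.random.default_rng(seed)
    idx = np.arange(n)
    expected = 50.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))
    upper = np.triu(rng.poisson(expected).astype(float))
    return upper + np.triu(upper, 1).T


if __name__ == "__main__":
    m1 = toy_contact_matrix(seed=1)
    m2 = toy_contact_matrix(seed=2)
    print("entropy signal length:", entropy_signal(m1).shape[0])
    print("similarity(m1, m2):", similarity(m1, m2))
```

Restricting the computation to windows on the diagonal keeps the cost low and uses the region where sparse contact matrices carry the most counts; real data would additionally need binning and bias handling, which this sketch leaves out.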