McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA.
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae217.
Hi-C data are commonly normalized using single-sample processing methods, with a focus on comparisons between regions within a given contact map. Here, we aim to compare contact maps across different samples. We demonstrate that unwanted variation, of likely technical origin, is present in Hi-C data with replicates from different individuals, and that the properties of this unwanted variation change across the contact map. We present band-wise normalization and batch correction, a method for normalization and batch correction of Hi-C data, and show that it substantially improves comparisons across samples, including in a quantitative trait loci analysis and in an analysis of differential enrichment across cell types.
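To make the band-wise idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes each sample is a square, symmetric contact matrix binned on the same grid, treats each matrix diagonal (all bin pairs at one genomic distance) as a "band", and quantile-normalizes each band across samples. The function names `quantile_normalize` and `bandwise_normalize` and the `max_band` parameter are hypothetical, and quantile normalization is used here only as one plausible per-band normalization; the batch-correction step is not shown.

```python
import numpy as np


def quantile_normalize(columns):
    """Quantile-normalize a 2-D array whose columns are samples.

    After normalization every column has the same sorted values,
    namely the across-sample mean of the sorted columns.
    """
    columns = np.asarray(columns, dtype=float)
    order = np.argsort(columns, axis=0)      # rank order within each sample
    sorted_vals = np.sort(columns, axis=0)
    mean_quantiles = sorted_vals.mean(axis=1)
    out = np.empty_like(columns)
    for j in range(columns.shape[1]):
        out[order[:, j], j] = mean_quantiles
    return out


def bandwise_normalize(maps, max_band=None):
    """Normalize contact maps across samples, one band (diagonal) at a time.

    maps     : list of square, symmetric arrays on the same bin grid (assumed).
    max_band : largest diagonal offset to normalize; bands beyond it are
               left unchanged. Defaults to all bands.
    """
    maps = [np.asarray(m, dtype=float) for m in maps]
    n = maps[0].shape[0]
    max_band = n - 1 if max_band is None else max_band
    out = [m.copy() for m in maps]
    for k in range(max_band + 1):
        # Rows: bin pairs at distance k; columns: samples.
        band = np.column_stack([np.diagonal(m, offset=k) for m in maps])
        normed = quantile_normalize(band)
        idx = np.arange(n - k)
        for j, o in enumerate(out):
            o[idx, idx + k] = normed[:, j]   # upper triangle
            o[idx + k, idx] = normed[:, j]   # mirror to keep symmetry
    return out
```

Handling each band separately lets the correction differ with genomic distance, which matches the observation that the unwanted variation changes across the contact map.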