Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, 4058 Basel, Switzerland.
Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany.
Gigascience. 2022 Jul 9;11. doi: 10.1093/gigascience/giac061.
Chromatin loops are an essential factor in the structural organization of the genome; however, their detection in Hi-C interaction matrices is a challenging and compute-intensive task. The approach presented here, integrated into the HiCExplorer software, is a chromatin loop detection algorithm that applies strict candidate selection based on continuous negative binomial distributions and then performs a Wilcoxon rank-sum test to detect enriched Hi-C interactions.
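To make the statistical step concrete, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test using the normal approximation and the standard library only. It is not HiCExplorer's implementation. The function names, the example counts, the 0.05 threshold, and the choice to compare a candidate's immediate neighborhood against a surrounding ring of bins are all assumptions made for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_sum_test(peak, background):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns (z, p_value); a positive z means `peak` tends to hold larger values.
    """
    n1, n2 = len(peak), len(background)
    ranks = average_ranks(list(peak) + list(background))
    w = sum(ranks[:n1])  # rank sum of the peak sample
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical interaction counts (illustrative values, not real Hi-C data):
# bins around a loop candidate versus the surrounding background ring.
peak_counts = [42, 39, 45, 41, 44, 40, 43, 38, 46]
background_counts = [12, 15, 11, 14, 13, 16, 10, 17, 9, 18, 12, 14]

z, p = rank_sum_test(peak_counts, background_counts)
is_enriched = z > 0 and p < 0.05  # threshold chosen for illustration only
```

A candidate would be kept in this sketch only when its neighborhood ranks significantly higher than the background. Using a rank-based test avoids assuming that the counts follow any particular distribution.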
HiCExplorer's loop detection has a high detection rate and accuracy. It is the fastest available CPU implementation and utilizes all threads offered by modern multicore platforms.
HiCExplorer's method of detecting loops with a continuous negative binomial function, combined with the donut approach from HiCCUPS, leads to reliable and fast computation of loops. All the loop-calling algorithms investigated provide differing results, which overlap by at most $\sim 50\%$. The tested in situ Hi-C data contain a large amount of noise; achieving better agreement between loop-calling algorithms will require cleaner Hi-C data and therefore future improvements to the experimental methods that generate the data.
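One way to quantify how far two loop callers agree is to count the loops from one caller that have a nearby counterpart in the other. The sketch below is an assumption-laden illustration, not the comparison procedure used in the study: the function name, the tuple layout `(chromosome, anchor_1, anchor_2)`, the 10 kb tolerance, and the example coordinates are all hypothetical.

```python
def overlap_fraction(loops_a, loops_b, tolerance=10_000):
    """Fraction of loops in loops_a that have a match in loops_b.

    A match requires the same chromosome and both anchors within
    `tolerance` base pairs of each other.
    """
    if not loops_a:
        return 0.0
    matched = 0
    for chrom_a, x_a, y_a in loops_a:
        for chrom_b, x_b, y_b in loops_b:
            if (chrom_a == chrom_b
                    and abs(x_a - x_b) <= tolerance
                    and abs(y_a - y_b) <= tolerance):
                matched += 1
                break  # count each loop in loops_a at most once
    return matched / len(loops_a)


# Hypothetical loop calls from two callers (illustrative coordinates only).
caller_one = [
    ("chr1", 1_000_000, 1_500_000),
    ("chr1", 2_000_000, 2_400_000),
    ("chr2", 500_000, 900_000),
    ("chr2", 3_000_000, 3_250_000),
]
caller_two = [
    ("chr1", 1_005_000, 1_495_000),
    ("chr2", 510_000, 900_000),
    ("chr3", 100_000, 300_000),
]

shared = overlap_fraction(caller_one, caller_two)
```

The measure is asymmetric: swapping the two arguments changes the denominator, so both directions are worth reporting when comparing callers.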