State Key Laboratory for Conservation and Utilization of Bio-resource, School of Ecology and Environment, School of Life Sciences and School of Medicine, Yunnan University, Kunming, Yunnan, China.
Department of Biology, University of Waterloo, Waterloo, ON, Canada.
PLoS One. 2024 Jul 15;19(7):e0298564. doi: 10.1371/journal.pone.0298564. eCollection 2024.
High-quality, chromosome-scale genomes are essential for genomic analyses. Analyses including 3D genomics, epigenetics, and comparative genomics rely on a high-quality genome assembly, which is often produced with the help of Hi-C data. Genome curation reveals that current Hi-C-assisted scaffolding algorithms either introduce ordering and orientation errors or fail to assemble high-quality chromosome-level scaffolds. Here, we present Puzzle Hi-C, software that uses Hi-C reads to accurately assign contigs or scaffolds to chromosomes. Puzzle Hi-C counts interactions in a triangular region of the Hi-C heatmap rather than in a square region. This strategy dramatically reduces the scaffolding interference caused by long-range interactions. The software also introduces a dynamic triangular-window strategy during assembly: the window starts small and expands with interactions to produce more effective clustering. Puzzle Hi-C outperforms available scaffolding tools.
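The difference between counting in a square and in a triangle can be illustrated with a toy contact matrix. The sketch below is not the Puzzle Hi-C implementation. It assumes a binned, symmetric contact matrix, a junction index between two neighbouring contigs, and a triangle defined as the cells of the square whose bin distance does not exceed the window size. The function names, the `min_contacts` threshold, and the rule for growing the window are hypothetical and only show one possible reading of a window that "expands with interactions".

```python
import numpy as np

def square_count(contacts, junction, window):
    """Sum contacts in the window x window square that touches the junction."""
    lo = max(junction - window, 0)
    hi = min(junction + window, contacts.shape[0])
    return float(contacts[lo:junction, junction:hi].sum())

def triangle_count(contacts, junction, window):
    """Sum only the cells of that square with bin distance <= window.

    These cells form the triangle closest to the heatmap diagonal, so
    contacts between distant bins are excluded.
    """
    lo = max(junction - window, 0)
    hi = min(junction + window, contacts.shape[0])
    block = contacts[lo:junction, junction:hi]
    rows = np.arange(lo, junction)[:, None]
    cols = np.arange(junction, hi)[None, :]
    near_diagonal = (cols - rows) <= window
    return float(block[near_diagonal].sum())

def expanding_triangle_count(contacts, junction, start=1, min_contacts=5.0):
    """Hypothetical growth rule: widen the triangle until it holds enough contacts."""
    max_window = contacts.shape[0]
    window = start
    total = triangle_count(contacts, junction, window)
    while total < min_contacts and window < max_window:
        window += 1
        total = triangle_count(contacts, junction, window)
    return window, total

# Toy data: uniform background plus one strong long-range contact.
toy = np.ones((8, 8))
toy[1, 6] = toy[6, 1] = 100.0  # bins 1 and 6 are five bins apart

print(square_count(toy, junction=4, window=3))    # 108.0, includes the long-range cell
print(triangle_count(toy, junction=4, window=3))  # 6.0, the long-range cell is excluded
print(expanding_triangle_count(toy, junction=4))  # (3, 6.0)
```

In this toy case the square picks up the long-range cell and its count is dominated by it, while the triangle of the same size ignores it. This is the kind of interference the abstract attributes to long-range interactions.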