Cole Simon, Kuksa Pavel P, Cifello Jeffrey, Valladares Otto, Leung Yuk Yee, Wang Li-San
Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, University of Pennsylvania.
Embry-Riddle Aeronautical University.
bioRxiv. 2024 Nov 26:2024.11.25.625258. doi: 10.1101/2024.11.25.625258.
Chromatin conformation capture (CCC) experiments, such as Hi-C and Capture Hi-C (CHiC), elucidate the three-dimensional organization of the genome and the epigenetic regulatory structures within it. CCC experiments produce large volumes of FASTQ sequencing data with substantial technical noise and require sophisticated computational pipelines to extract meaningful results. Large-scale CCC data repositories such as 4D Nucleome and ENCODE mostly provide raw contact information but lack annotated, statistically significant interaction data suitable for downstream genetic and genomic analyses.
Here, we present CHARMER, an end-to-end pipeline integrated across multiple CCC assay types (Hi-C, CHiC) that generates statistically significant, harmonized, queryable chromatin interactions in a consistent BED-like format across cell/tissue types and CCC assays.
CHARMER is freely available at https://bitbucket.org/wanglab-upenn/CHARMER, and harmonized chromatin interaction data will be available in the upcoming version of the FILER database (https://lisanwanglab.org/FILER).
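To illustrate what "queryable" interactions in a BED-like format could look like in practice, the following is a minimal Python sketch. The column layout (two anchors followed by a q-value), the `Interaction` class, the function names, and the example records are all assumptions made for illustration; they are not CHARMER's actual output schema or API, and the 0.05 threshold is only a conventional placeholder.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Interaction:
    """One chromatin interaction between two genomic anchors (hypothetical layout)."""
    chrom_a: str
    start_a: int
    end_a: int
    chrom_b: str
    start_b: int
    end_b: int
    q_value: float


def parse_interactions(lines: Iterable[str]) -> Iterator[Interaction]:
    """Parse tab-separated records, skipping blank lines and '#' comments.

    Assumed columns: chromA, startA, endA, chromB, startB, endB, q_value.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        yield Interaction(f[0], int(f[1]), int(f[2]),
                          f[3], int(f[4]), int(f[5]), float(f[6]))


def overlaps(chrom: str, start: int, end: int,
             q_chrom: str, q_start: int, q_end: int) -> bool:
    """Half-open interval overlap, following the BED coordinate convention."""
    return chrom == q_chrom and start < q_end and q_start < end


def query(interactions: Iterable[Interaction], chrom: str, start: int, end: int,
          max_q: float = 0.05) -> List[Interaction]:
    """Return interactions with either anchor overlapping the region and q <= max_q."""
    hits = []
    for it in interactions:
        if it.q_value > max_q:
            continue
        if (overlaps(it.chrom_a, it.start_a, it.end_a, chrom, start, end)
                or overlaps(it.chrom_b, it.start_b, it.end_b, chrom, start, end)):
            hits.append(it)
    return hits


# Made-up example records; not real data.
EXAMPLE_LINES = [
    "#chromA\tstartA\tendA\tchromB\tstartB\tendB\tq_value",
    "chr1\t1000\t2000\tchr1\t50000\t51000\t0.01",
    "chr1\t3000\t4000\tchr1\t90000\t91000\t0.20",
    "chr2\t1000\t2000\tchr2\t7000\t8000\t0.001",
]

if __name__ == "__main__":
    for hit in query(parse_interactions(EXAMPLE_LINES), "chr1", 1500, 1600):
        print(hit)
```

A linear scan is used here for clarity; for genome-scale files, an indexed lookup (for example, a sorted and indexed file or an interval tree) would be a more realistic choice.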