Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, United States.
Thurston Arthritis Research Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, United States.
Bioinformatics. 2024 Jun 3;40(6). doi: 10.1093/bioinformatics/btae352.
3D chromatin structure plays an important role in regulating gene expression, and alterations to this structure can result in developmental abnormalities and disease. While genomic approaches like Hi-C and Micro-C can provide valuable insights into 3D chromatin architecture, the resulting datasets are extremely large and difficult to manipulate.
Here, we present mariner, a rapid and memory-efficient tool to extract, aggregate, and plot data from Hi-C matrices within the R/Bioconductor environment. Mariner simplifies the process of querying and extracting contacts from multiple Hi-C files using a parallel and block-processing approach. Modular functions allow complete workflow customization for advanced users, while all-in-one functions are available for running the most common types of analyses. Finally, tight integration with existing Bioconductor infrastructure enables complete analysis and visualization of Hi-C data in R.
Available on GitHub at https://github.com/EricSDavis/mariner and on Bioconductor at https://www.bioconductor.org/packages/release/bioc/html/mariner.html.
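The block-processing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not mariner's API (mariner is an R package, and its real functions are not reproduced here); it is a toy Python example, and every name in it (`iter_blocks`, `extract_window`, `aggregate_windows`) is hypothetical. It assumes a small in-memory contact matrix and a list of pixel coordinates, and shows how extracting windows one block at a time keeps only `block_size` windows in memory while a running sum builds the aggregate.

```python
from typing import Iterator, List, Sequence, Tuple

Matrix = List[List[float]]
Pixel = Tuple[int, int]


def iter_blocks(items: Sequence[Pixel], block_size: int) -> Iterator[Sequence[Pixel]]:
    """Yield consecutive slices of `items`, each with at most `block_size` entries."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    for start in range(0, len(items), block_size):
        yield items[start:start + block_size]


def extract_window(contacts: Matrix, row: int, col: int, flank: int) -> Matrix:
    """Return the (2*flank+1) square window centred on (row, col).

    Cells that fall outside the matrix are filled with 0.0 (an assumption
    made for this sketch; other padding choices are possible).
    """
    n_rows = len(contacts)
    n_cols = len(contacts[0]) if n_rows else 0
    window = []
    for r in range(row - flank, row + flank + 1):
        out_row = []
        for c in range(col - flank, col + flank + 1):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            out_row.append(contacts[r][c] if inside else 0.0)
        window.append(out_row)
    return window


def aggregate_windows(contacts: Matrix, pixels: Sequence[Pixel],
                      flank: int, block_size: int) -> Matrix:
    """Sum the windows around all pixels, holding one block of windows at a time."""
    size = 2 * flank + 1
    total = [[0.0] * size for _ in range(size)]
    for block in iter_blocks(pixels, block_size):
        # Only the windows of the current block are materialised here.
        windows = [extract_window(contacts, r, c, flank) for r, c in block]
        for window in windows:
            for i in range(size):
                for j in range(size):
                    total[i][j] += window[i][j]
    return total


if __name__ == "__main__":
    # Toy 6x6 "contact matrix" whose entries encode their own position.
    toy = [[float(i * 6 + j) for j in range(6)] for i in range(6)]
    summed = aggregate_windows(toy, [(2, 2), (3, 3)], flank=1, block_size=1)
    for line in summed:
        print(line)
```

The block size changes only how many windows are held in memory at once, not the result, so it can be tuned to the available memory. In a real workflow the blocks would be read from Hi-C files on disk and could be dispatched to parallel workers.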