Churros: a Docker-based pipeline for large-scale epigenomic analysis.

Affiliations

School of Biomedical Sciences, Hunan University, Changsha, Hunan, China.

Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan.

Publication information

DNA Res. 2024 Feb 1;31(1). doi: 10.1093/dnares/dsad026.

DOI:10.1093/dnares/dsad026

PMID:38102723

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11389749/

Abstract

The epigenome, which reflects the modifications on chromatin or DNA sequences, provides crucial insight into gene expression regulation and cellular activity. With the continuous accumulation of epigenomic datasets such as chromatin immunoprecipitation followed by sequencing (ChIP-seq) data, there is a great demand for a streamlined pipeline to consistently process them, especially for large-dataset comparisons involving hundreds of samples. Here, we present Churros, an end-to-end epigenomic analysis pipeline that is environmentally independent and optimized for handling large-scale data. We successfully demonstrated the effectiveness of Churros by analyzing large-scale ChIP-seq datasets with the hg38 or Telomere-to-Telomere (T2T) human reference genome. We found that applying T2T to the typical analysis workflow has important impacts on read mapping, quality checks, and peak calling. We also introduced a useful feature to study context-specific epigenomic landscapes. Churros will contribute a comprehensive and unified resource for analyzing large-scale epigenomic data.
