Department of Bioscience and Nutrition, Karolinska Institute, Blickagången 16, Huddinge SE-141 83, Sweden.
Genomics. 2024 Jul;116(4):110858. doi: 10.1016/j.ygeno.2024.110858. Epub 2024 May 11.
The ever-decreasing cost of Next-Generation Sequencing, coupled with the emergence of efficient and reproducible analysis pipelines, has made genomic methods more accessible. However, downstream analyses are basic or missing in most workflows, creating a significant barrier for non-bioinformaticians. To help close this gap, we developed Cactus, an end-to-end pipeline for analyzing ATAC-Seq and mRNA-Seq data, either separately or jointly. Its Nextflow-, container-, and virtual environment-based architecture ensures efficient and reproducible analyses. Cactus preprocesses raw reads, conducts differential analyses between conditions, and performs enrichment analyses in various databases, including DNA-binding motifs, ChIP-Seq binding sites, chromatin states, and ontologies. We demonstrate the utility of Cactus in a multi-modal and multi-species case study and by showcasing its unique capabilities compared with other ATAC-Seq pipelines. In conclusion, Cactus can help researchers gain comprehensive insights from chromatin accessibility and gene expression data in a quick, user-friendly, and reproducible manner.