University of Nebraska Medical Center, Omaha, NE 68105 USA.
Deloitte Consulting LLP, Health Data and AI, Arlington, VA, USA.
Brief Bioinform. 2024 Jul 23;25(Supplement_1). doi: 10.1093/bib/bbae090.
Assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) generates genome-wide chromatin accessibility profiles, providing valuable insights into epigenetic gene regulation at both the pooled-cell and single-cell levels. Comprehensive analysis of ATAC-seq data relies on several interdependent programs, and learning the correct sequence of steps needed to process the data can be a major hurdle. Selecting appropriate parameters at each stage (pre-analysis, core analysis, and advanced downstream analysis) is important for accurate analysis and interpretation of ATAC-seq data. Additionally, obtaining and working within a limited computational environment presents a significant challenge to non-bioinformatic researchers. Therefore, we present Cloud ATAC, an open-source, cloud-based interactive framework that provides scalable, flexible, and streamlined analysis workflows based on a best-practices approach for pooled-cell and single-cell ATAC-seq data. These workflows draw on the on-demand computational power and memory, scalability, and secure and compliant environment provided by Google Cloud. Additionally, we leverage Jupyter Notebook's interactive computing platform, which combines live code, tutorials, narrative text, flashcards, quizzes, and custom visualizations to enhance learning and analysis. Further, leveraging GPU instances significantly reduced the run-time of the single-cell framework. The source code and data are publicly available through NIH Cloud Lab at https://github.com/NIGMS/ATAC-Seq-and-Single-Cell-ATAC-Seq-Analysis. This manuscript describes the development of a resource module that is part of a learning platform named "NIGMS Sandbox for Cloud-based Learning" (https://github.com/NIGMS/NIGMS-Sandbox). The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox [1] at the beginning of this Supplement.
This module delivers learning materials on the analysis of bulk and single-cell ATAC-seq data in an interactive format that uses appropriate cloud resources for data access and analyses.