High-performance method for identification of super enhancers from ChIP-Seq data with configurable cloud virtual machines.

Authors

Orlova Natalia N, Bogatova Olga V, Orlov Alexey V

Affiliations

Moscow Institute of Physics and Technology (State University), Dolgoprudny, Moscow Region, Russia.

Prokhorov General Physics Institute of the Russian Academy of Sciences, Moscow, Russia.

Publication

MethodsX. 2020 Nov 28;7:101165. doi: 10.1016/j.mex.2020.101165. eCollection 2020.

Abstract

A universal method is proposed for rapidly identifying super-enhancers, which are large domains of multiple closely spaced enhancers. The method combines configurable cloud virtual machines (cVMs) with the rank-ordering of super-enhancers (ROSE) algorithm. Super-enhancers are identified by a cVM-based analysis of the ChIP-seq binding patterns of a mark associated with active enhancers. The method is described step by step: configuration of the cVM; ChIP-seq data alignment; peak calling; the ROSE algorithm; and interpretation of the results on a client machine. The method was validated by searching for super-enhancers with the H3K27ac mark in sample datasets from a cell line (human MCF-7), mouse tissue (heart), and human tissue (adrenal gland). The total analysis time for raw ChIP-seq data ranges from 15 to 48 min, depending on the number of initial short reads. Depending on the data processing step and the availability of multi-threading, a cVM can be scaled up to a multi-CPU configuration with a large amount of RAM. An important feature of the method is that it can be run from a low-performance client machine with virtually any OS. The method also allows different sample datasets to be processed simultaneously and independently on multiple clones of a single cVM.

• Cloud VMs were used for rapid processing of ChIP-seq data to identify super-enhancers.

• The method can be used from a low-performance computer with virtually any OS.

• It can be scaled up to process individual sample datasets in parallel, each on its own VM, for rapid high-throughput processing.
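The ranking step named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline and not the ROSE code itself: it only shows the general idea of stitching nearby peaks, ranking the stitched regions by signal, and cutting where the scaled rank-versus-signal curve reaches a slope of about 1. The function names, the tuple layout of a peak, and the toy data are hypothetical; the 12.5 kb stitching distance is the commonly cited ROSE default and is not stated in this abstract.

```python
from typing import List, Tuple

# Hypothetical peak layout: (chromosome, start, end, signal)
Peak = Tuple[str, int, int, float]


def stitch_peaks(peaks: List[Peak], max_gap: int = 12_500) -> List[Peak]:
    """Merge peaks on the same chromosome whose gap is at most max_gap bp."""
    stitched: List[Peak] = []
    for chrom, start, end, signal in sorted(peaks):
        if stitched and stitched[-1][0] == chrom and start - stitched[-1][2] <= max_gap:
            prev = stitched[-1]
            # Extend the previous region and accumulate its signal
            stitched[-1] = (chrom, prev[1], max(prev[2], end), prev[3] + signal)
        else:
            stitched.append((chrom, start, end, signal))
    return stitched


def signal_cutoff(signals: List[float]) -> float:
    """Signal value where the scaled rank-vs-signal curve has slope of about 1."""
    ordered = sorted(signals)
    n = len(ordered)
    top = ordered[-1]
    if n < 2 or top <= 0:
        return top
    # With both axes scaled to [0, 1], a slope-1 line touches the curve
    # from below at the point that minimizes (scaled signal - scaled rank).
    best = min(range(n), key=lambda i: ordered[i] / top - i / (n - 1))
    return ordered[best]


def call_super_enhancers(peaks: List[Peak], max_gap: int = 12_500) -> List[Peak]:
    """Return stitched regions above the cutoff, strongest first."""
    stitched = stitch_peaks(peaks, max_gap)
    cutoff = signal_cutoff([p[3] for p in stitched])
    return sorted((p for p in stitched if p[3] > cutoff), key=lambda p: -p[3])


# Toy data, made up for illustration only
toy_peaks: List[Peak] = [
    ("chr1", 1_000, 2_000, 5.0),
    ("chr1", 5_000, 6_000, 7.0),      # 3 kb from the previous peak: stitched
    ("chr1", 100_000, 101_000, 1.0),
    ("chr2", 1_000, 2_000, 2.0),
    ("chr2", 50_000, 51_000, 1.5),
    ("chr3", 0, 500, 1.2),
]
super_enhancers = call_super_enhancers(toy_peaks)
```

In a real analysis the signal would come from aligned reads over the called peaks, typically with background subtraction, and each sample dataset would be processed on its own cVM clone as the abstract describes.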


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/60ef/7897706/168680eaaf12/fx1.jpg
