Reusable, extensible, and modifiable R scripts and Kepler workflows for comprehensive single set ChIP-seq analysis.

Authors

Nathan Cormier, Tyler Kolisnik, Mark Bieda

Affiliation

Department of Biochemistry and Molecular Biology, University of Calgary Cumming School of Medicine, Rm HSC1151, 3330 Hospital Dr. NW, Calgary, AB, T2N4N1, Canada.

Publication information

BMC Bioinformatics. 2016 Jul 5;17(1):270. doi: 10.1186/s12859-016-1125-3.

DOI:10.1186/s12859-016-1125-3
PMID:27377783
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4932705/
Abstract

BACKGROUND

There has been an enormous expansion of use of chromatin immunoprecipitation followed by sequencing (ChIP-seq) technologies. Analysis of large-scale ChIP-seq datasets involves a complex series of steps and production of several specialized graphical outputs. A number of systems have emphasized custom development of ChIP-seq pipelines. These systems are primarily based on custom programming of a single, complex pipeline or supply libraries of modules and do not produce the full range of outputs commonly produced for ChIP-seq datasets. It is desirable to have more comprehensive pipelines, in particular ones addressing common metadata tasks, such as pathway analysis, and pipelines producing standard complex graphical outputs. It is advantageous if these are highly modular systems, available as both turnkey pipelines and individual modules, that are easily comprehensible, modifiable and extensible to allow rapid alteration in response to new analysis developments in this growing area. Furthermore, it is advantageous if these pipelines allow data provenance tracking.

RESULTS

We present a set of 20 ChIP-seq analysis software modules implemented in the Kepler workflow system; most (18/20) were also implemented as standalone, fully functional R scripts. The set consists of four full turnkey pipelines and 16 component modules. The turnkey pipelines in Kepler allow data provenance tracking. Implementation emphasized use of common R packages and widely used external tools (e.g., MACS for peak finding), along with custom programming. This software presents comprehensive solutions and easily repurposed code blocks for ChIP-seq analysis and pipeline creation. Tasks include mapping raw reads, peak finding via MACS, summary statistics, peak location statistics, summary plots centered on the transcription start site (TSS), gene ontology, pathway analysis, and de novo motif finding, among others.
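
As an illustration of one task named above, peak location statistics relative to the TSS, the following is a minimal Python sketch. It is not the authors' R or Kepler code. The function names, the tuple-based input format, and the 1 kb and 10 kb distance cutoffs are all assumptions chosen for the example.

```python
from bisect import bisect_left
from collections import Counter, defaultdict


def build_tss_index(tss_records):
    """Group TSS positions by chromosome and sort them for binary search.

    tss_records: iterable of (chrom, position) tuples (hypothetical format).
    """
    index = defaultdict(list)
    for chrom, pos in tss_records:
        index[chrom].append(pos)
    for positions in index.values():
        positions.sort()
    return index


def distance_to_nearest_tss(index, chrom, start, end):
    """Absolute distance from the peak midpoint to the nearest TSS.

    Returns None when the chromosome has no annotated TSS.
    """
    positions = index.get(chrom)
    if not positions:
        return None
    mid = (start + end) // 2
    i = bisect_left(positions, mid)
    # Only the TSS just left and just right of the midpoint can be nearest.
    candidates = positions[max(i - 1, 0): i + 1]
    return min(abs(mid - p) for p in candidates)


def summarize_peak_locations(peaks, tss_records, proximal=1000, nearby=10000):
    """Count peaks per distance class (cutoffs are illustrative assumptions)."""
    index = build_tss_index(tss_records)
    counts = Counter()
    for chrom, start, end in peaks:
        d = distance_to_nearest_tss(index, chrom, start, end)
        if d is None:
            counts["no_tss"] += 1
        elif d <= proximal:
            counts["proximal"] += 1
        elif d <= nearby:
            counts["nearby"] += 1
        else:
            counts["distal"] += 1
    return dict(counts)


if __name__ == "__main__":
    # Toy data, invented for the example.
    tss = [("chr1", 1000), ("chr1", 20000)]
    peaks = [("chr1", 900, 1100), ("chr1", 5000, 5200),
             ("chr1", 50000, 50200), ("chr2", 10, 20)]
    print(summarize_peak_locations(peaks, tss))
```

In practice the peak intervals would come from a peak caller's output (for example a BED-like file produced by MACS) and the TSS coordinates from a gene annotation. Parsing of those files is left out here to keep the sketch self-contained.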

CONCLUSIONS

These pipelines range from those performing a single task to those performing full analyses of ChIP-seq data. The pipelines are supplied as both Kepler workflows, which allow data provenance tracking, and, in the majority of cases, as standalone R scripts. These pipelines are designed for ease of modification and repurposing.
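
The idea of composing a turnkey pipeline from individual modules while recording data provenance can be sketched in a few lines of Python. This is a simplified illustration only; it does not reproduce Kepler's provenance mechanism or the authors' modules. The helper names (`fingerprint`, `run_pipeline`) and the two toy modules are hypothetical.

```python
import hashlib
import json


def fingerprint(obj):
    """Short, stable hash of a JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_pipeline(modules, data):
    """Run modules in order and log what each one consumed and produced."""
    provenance = []
    for module in modules:
        result = module(data)
        provenance.append({
            "module": module.__name__,
            "input": fingerprint(data),
            "output": fingerprint(result),
        })
        data = result
    return data, provenance


# Two toy modules standing in for real analysis steps (hypothetical).
def filter_peaks(peaks):
    """Keep peaks whose score is at least 10 (arbitrary cutoff)."""
    return [p for p in peaks if p["score"] >= 10]


def count_peaks(peaks):
    """Summarize the peak list as a single count."""
    return {"n_peaks": len(peaks)}


if __name__ == "__main__":
    peaks = [{"score": 5}, {"score": 12}, {"score": 30}]
    summary, log = run_pipeline([filter_peaks, count_peaks], peaks)
    print(summary)
    for entry in log:
        print(entry)
```

Because each module is an ordinary function, it can be run on its own or combined into a longer pipeline. The log links each step's output hash to the next step's input hash, which is the property a provenance record needs in order to trace a result back through the steps that produced it.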

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5dc/4932705/747e676f7157/12859_2016_1125_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5dc/4932705/c6a4e33bebc4/12859_2016_1125_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5dc/4932705/a119f44168db/12859_2016_1125_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5dc/4932705/3a12e21440b8/12859_2016_1125_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5dc/4932705/e67233fee3b2/12859_2016_1125_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5dc/4932705/a62191acd344/12859_2016_1125_Fig6_HTML.jpg
