Fish the ChIPs: a pipeline for automated genomic annotation of ChIP-Seq data.

Affiliation

Department of Experimental Oncology, European Institute of Oncology (IEO), IFOM-IEO Campus, Via Adamello 16, Milan, Italy.

Publication information

Biol Direct. 2011 Oct 6;6:51. doi: 10.1186/1745-6150-6-51.

DOI:10.1186/1745-6150-6-51

PMID:21978789

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3201895/

Abstract

BACKGROUND

High-throughput sequencing is generating massive amounts of data at a pace that largely exceeds the throughput of data analysis routines. Here we introduce Fish the ChIPs (FC), a computational pipeline aimed at a broad range of users and designed to perform complete ChIP-Seq data analysis of an unlimited number of samples, thus increasing throughput and reproducibility while saving time.

RESULTS

Starting from short read sequences, FC performs the following steps: 1) quality controls, 2) alignment to a reference genome, 3) peak calling, 4) genomic annotation, 5) generation of raw signal tracks for visualization on the UCSC and IGV genome browsers. FC exploits some of the fastest and most effective tools available today. Installation on a Mac platform requires only very basic computational skills, while configuration and usage are supported by a user-friendly graphical user interface. Alternatively, FC can be compiled from the source code on any Unix machine and then run with every parameter customized through a simple configuration text file, which can be generated using a dedicated user-friendly web form. In terms of execution time, FC can be run on a desktop machine, although a computer cluster is recommended for analyses of large batches of data. FC is well suited to work with data coming from Illumina Solexa Genome Analyzers or ABI SOLiD, and its usage can potentially be extended to any sequencing platform.
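The abstract does not describe how FC implements step 4, so the following is only a rough illustration of what genomic annotation of peaks can involve: assigning each peak to the nearest transcription start site (TSS). The gene names, coordinates, and the functions `build_tss_index` and `annotate_peak` are hypothetical and are not taken from FC; strand is ignored for simplicity.

```python
from bisect import bisect_left
from collections import defaultdict


def build_tss_index(genes):
    """Group (chrom, tss, name) records by chromosome, sorted by TSS position."""
    index = defaultdict(list)
    for chrom, tss, name in genes:
        index[chrom].append((tss, name))
    for chrom in index:
        index[chrom].sort()
    return index


def annotate_peak(index, chrom, start, end):
    """Return (gene, midpoint - TSS) for the nearest TSS, or None if the
    chromosome has no annotated genes. Strand is ignored in this sketch."""
    sites = index.get(chrom)
    if not sites:
        return None
    mid = (start + end) // 2
    # Locate the insertion point of the midpoint among the sorted TSSs,
    # then compare only the neighbours on either side of it.
    i = bisect_left(sites, (mid,))
    candidates = sites[max(i - 1, 0):i + 1]
    tss, name = min(candidates, key=lambda site: abs(site[0] - mid))
    return name, mid - tss


# Hypothetical toy annotation and peaks, for illustration only.
genes = [("chr1", 1000, "GENE_A"), ("chr1", 5000, "GENE_B"), ("chr2", 300, "GENE_C")]
peaks = [("chr1", 1100, 1300), ("chr1", 4000, 4400), ("chr3", 10, 20)]

index = build_tss_index(genes)
annotations = [annotate_peak(index, *peak) for peak in peaks]
for peak, hit in zip(peaks, annotations):
    print(peak, hit)
```

Sorting the TSS positions once per chromosome keeps each lookup logarithmic, which matters when many peak files are annotated in a batch. A real annotation step would typically also report strand-aware distances and feature classes such as promoter, intron, or intergenic.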

CONCLUSIONS

Compared to existing tools, FC has two main advantages that make it suitable for a broad range of users. First, it can be installed and run by wet-lab biologists on a Mac machine. Second, it can handle an unlimited number of samples, which makes it convenient for large analyses. In this context, computational biologists can increase the reproducibility of their ChIP-Seq data analyses while saving time for downstream analyses.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9247/3201895/6a381d695e4c/1745-6150-6-51-1.jpg

Similar articles

The ChIP-Seq tools and web server: a resource for analyzing ChIP-seq and other types of genomic data.

BMC Genomics. 2016 Nov 18;17(1):938. doi: 10.1186/s12864-016-3288-8.

An integrated software system for analyzing ChIP-chip and ChIP-seq data.

Nat Biotechnol. 2008 Nov;26(11):1293-300. doi: 10.1038/nbt.1505. Epub 2008 Nov 2.

Cistrome: an integrative platform for transcriptional regulation studies.

Genome Biol. 2011 Aug 22;12(8):R83. doi: 10.1186/gb-2011-12-8-r83.

HiChIP: a high-throughput pipeline for integrative analysis of ChIP-Seq data.

BMC Bioinformatics. 2014 Aug 15;15(1):280. doi: 10.1186/1471-2105-15-280.

mu-CS: an extension of the TM4 platform to manage Affymetrix binary data.

BMC Bioinformatics. 2010 Jun 10;11:315. doi: 10.1186/1471-2105-11-315.

GENAVi: a shiny web application for gene expression normalization, analysis and visualization.

BMC Genomics. 2019 Oct 16;20(1):745. doi: 10.1186/s12864-019-6073-7.

PeakAnalyzer: genome-wide annotation of chromatin binding and modification loci.

BMC Bioinformatics. 2010 Aug 6;11:415. doi: 10.1186/1471-2105-11-415.

ChIPpeakAnno: a Bioconductor package to annotate ChIP-seq and ChIP-chip data.

BMC Bioinformatics. 2010 May 11;11:237. doi: 10.1186/1471-2105-11-237.

Nebula--a web-server for advanced ChIP-seq data analysis.

Bioinformatics. 2012 Oct 1;28(19):2517-9. doi: 10.1093/bioinformatics/bts463. Epub 2012 Jul 24.

Cited by

Dynamics in protein translation sustaining T cell preparedness.

Nat Immunol. 2020 Aug;21(8):927-937. doi: 10.1038/s41590-020-0714-5. Epub 2020 Jul 6.

Targeting the scaffolding role of LSD1 (KDM1A) poises acute myeloid leukemia cells for retinoic acid-induced differentiation.

Sci Adv. 2020 Apr 8;6(15):eaax2746. doi: 10.1126/sciadv.aax2746. eCollection 2020 Apr.

An immunoregulatory and tissue-residency program modulated by c-MAF in human T17 cells.

Nat Immunol. 2018 Oct;19(10):1126-1136. doi: 10.1038/s41590-018-0200-5. Epub 2018 Sep 10.

loss cooperates with overexpression to promote lymphoma in mice.

Blood. 2017 May 11;129(19):2645-2656. doi: 10.1182/blood-2016-08-733469. Epub 2017 Mar 13.

ChiLin: a comprehensive ChIP-seq and DNase-seq quality control and analysis pipeline.

BMC Bioinformatics. 2016 Oct 3;17(1):404. doi: 10.1186/s12859-016-1274-4.

Reusable, extensible, and modifiable R scripts and Kepler workflows for comprehensive single set ChIP-seq analysis.

BMC Bioinformatics. 2016 Jul 5;17(1):270. doi: 10.1186/s12859-016-1125-3.

A dual cis-regulatory code links IRF8 to constitutive and inducible gene expression in macrophages.

Genes Dev. 2015 Feb 15;29(4):394-408. doi: 10.1101/gad.257592.114. Epub 2015 Jan 30.

HiChIP: a high-throughput pipeline for integrative analysis of ChIP-Seq data.

BMC Bioinformatics. 2014 Aug 15;15(1):280. doi: 10.1186/1471-2105-15-280.

Requirement for the histone deacetylase Hdac3 for the inflammatory gene expression program in macrophages.

Proc Natl Acad Sci U S A. 2012 Oct 16;109(42):E2865-74. doi: 10.1073/pnas.1121131109. Epub 2012 Jul 16.
