Systematic evaluation of the impact of ChIP-seq read designs on genome coverage, peak identification, and allele-specific binding detection.

Authors

Zhang Qi, Zeng Xin, Younkin Sam, Kawli Trupti, Snyder Michael P, Keleş Sündüz

Affiliations

Department of Statistics, University of Nebraska Lincoln, Lincoln, Nebraska, USA.

Department of Statistics, University of Wisconsin Madison, Madison, Wisconsin, USA.

Publication information

BMC Bioinformatics. 2016 Feb 24;17:96. doi: 10.1186/s12859-016-0957-1.

DOI:10.1186/s12859-016-0957-1

PMID:26908256

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4765064/

Abstract

BACKGROUND

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiments revolutionized genome-wide profiling of transcription factors and histone modifications. Although maturing sequencing technologies allow these experiments to be carried out with short (36-50 bp), long (75-100 bp), single-end, or paired-end reads, the impact of these read parameters on downstream data analysis is not well understood. In this paper, we evaluate the effects of different read parameters on genome sequence alignment, coverage of different classes of genomic features, peak identification, and allele-specific binding detection.

RESULTS

We generated 101 bp paired-end ChIP-seq data for many transcription factors from human GM12878 and MCF7 cell lines. Systematic evaluations using in silico variations of these data as well as fully simulated data revealed a complex interplay between the sequencing parameters and analysis tools, and indicated clear advantages of paired-end designs in several aspects such as alignment accuracy, peak resolution, and, most notably, allele-specific binding detection.
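One way to picture an "in silico variation" of a 101 bp paired-end dataset is to trim each read to a shorter length and, for a single-end design, discard the second mate. The sketch below is only an illustration under that assumption; the abstract does not describe the authors' actual procedure, and the names `Read`, `trim`, and `derive_design` are hypothetical.

```python
from typing import NamedTuple, Tuple


class Read(NamedTuple):
    """A minimal FASTQ-like record (hypothetical helper type)."""
    name: str
    seq: str
    qual: str


def trim(read: Read, length: int) -> Read:
    """Keep only the first `length` bases and their quality scores."""
    return Read(read.name, read.seq[:length], read.qual[:length])


def derive_design(mate1: Read, mate2: Read, length: int,
                  paired: bool) -> Tuple[Read, ...]:
    """Derive a shorter and/or single-end design from one read pair.

    Assumption: a shorter design is emulated by truncating from the
    3' end, and a single-end design by keeping mate 1 only.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if paired:
        return (trim(mate1, length), trim(mate2, length))
    return (trim(mate1, length),)


if __name__ == "__main__":
    # Toy 101 bp pair; real data would be parsed from FASTQ files.
    m1 = Read("frag1/1", "A" * 101, "I" * 101)
    m2 = Read("frag1/2", "C" * 101, "I" * 101)
    for length, paired in [(36, False), (50, True), (101, True)]:
        reads = derive_design(m1, m2, length, paired)
        label = "paired-end" if paired else "single-end"
        print(length, label, [len(r.seq) for r in reads])
```

Deriving every design from the same fragments keeps the underlying library fixed, so differences in alignment or peak calls can be attributed to read length and pairing rather than to a different experiment.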

CONCLUSIONS

Our work elucidates the effect of design on the downstream analysis and provides insights to investigators in deciding sequencing parameters in ChIP-seq experiments. We present the first systematic evaluation of the impact of ChIP-seq designs on allele-specific binding detection and highlight the power of paired-end designs in such studies.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/76dc/4765064/31a9a4cdd7c0/12859_2016_957_Fig1_HTML.jpg
