Efficiently identifying genome-wide changes with next-generation sequencing data.

Affiliation

Biostatistics Branch, National Institute of Environmental Health Sciences, RTP, NC 27709, USA.

Publication information

Nucleic Acids Res. 2011 Oct;39(19):e130. doi: 10.1093/nar/gkr592. Epub 2011 Jul 29.

DOI:10.1093/nar/gkr592

PMID:21803788

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3201882/

Abstract

We propose a new and effective statistical framework for identifying genome-wide differential changes in epigenetic marks with ChIP-seq data or gene expression with mRNA-seq data, and we develop a new software tool EpiCenter that can efficiently perform data analysis. The key features of our framework are: (i) providing multiple normalization methods to achieve appropriate normalization under different scenarios, (ii) using a sequence of three statistical tests to eliminate background regions and to account for different sources of variation and (iii) allowing adjustment for multiple testing to control false discovery rate (FDR) or family-wise type I error. Our software EpiCenter can perform multiple analytic tasks including: (i) identifying genome-wide epigenetic changes or differentially expressed genes, (ii) finding transcription factor binding sites and (iii) converting multiple-sample sequencing data into a single read-count data matrix. By simulation, we show that our framework achieves a low FDR consistently over a broad range of read coverage and biological variation. Through two real examples, we demonstrate the effectiveness of our framework and the usages of our tool. In particular, we show that our novel and robust 'parsimony' normalization method is superior to the widely-used 'tagRatio' method. Our software EpiCenter is freely available to the public.
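The abstract names two generic building blocks that can be illustrated without reference to EpiCenter's internals: scaling one sample to another by total read count, and adjusting p-values to control either the family-wise type I error or the FDR. The Python sketch below is a minimal illustration under stated assumptions, not EpiCenter's implementation. It assumes that 'tagRatio' refers to scaling by the ratio of total read counts, and it uses the standard Bonferroni and Benjamini-Hochberg procedures as stand-ins for the adjustments the abstract mentions. All function names are hypothetical. The 'parsimony' normalization and the three statistical tests are not specified in the abstract and are therefore not sketched.

```python
from typing import List, Sequence


def tag_ratio_scale(reads_a: Sequence[int], reads_b: Sequence[int]) -> float:
    """Scale factor bringing sample B to sample A's total read count.

    Assumption: 'tagRatio' normalization means scaling by the ratio of
    total reads; the abstract does not define it.
    """
    total_b = sum(reads_b)
    if total_b == 0:
        raise ValueError("sample B has no reads")
    return sum(reads_a) / total_b


def bonferroni(pvalues: Sequence[float]) -> List[float]:
    """Family-wise adjustment: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """FDR adjustment (Benjamini-Hochberg step-up), returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy read counts per region for two samples (illustrative numbers only).
    sample_a = [10, 20, 30]
    sample_b = [5, 10, 15]
    scale = tag_ratio_scale(sample_a, sample_b)
    print("scale factor:", scale)
    print("scaled sample B:", [count * scale for count in sample_b])

    # Toy p-values, one per region (illustrative numbers only).
    pvals = [0.01, 0.04, 0.03, 0.005]
    print("Bonferroni:", bonferroni(pvals))
    print("Benjamini-Hochberg:", benjamini_hochberg(pvals))
```

Total-count scaling implicitly assumes that most reads come from regions that do not differ between the samples; the abstract's claim that 'parsimony' normalization is superior suggests this assumption does not always hold.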

Similar articles

Is this the right normalization? A diagnostic tool for ChIP-seq normalization.

BMC Bioinformatics. 2015 May 9;16:150. doi: 10.1186/s12859-015-0579-z.

Role of ChIP-seq in the discovery of transcription factor binding sites, differential gene regulation mechanism, epigenetic marks and beyond.

Cell Cycle. 2014;13(18):2847-52. doi: 10.4161/15384101.2014.949201.

A novel statistical method for quantitative comparison of multiple ChIP-seq datasets.

Bioinformatics. 2015 Jun 15;31(12):1889-96. doi: 10.1093/bioinformatics/btv094. Epub 2015 Feb 13.

A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data.

PLoS One. 2017 May 1;12(5):e0176185. doi: 10.1371/journal.pone.0176185. eCollection 2017.

Seqinspector: position-based navigation through the ChIP-seq data landscape to identify gene expression regulators.

BMC Bioinformatics. 2016 Feb 12;17:85. doi: 10.1186/s12859-016-0938-4.

csaw: a Bioconductor package for differential binding analysis of ChIP-seq data using sliding windows.

Nucleic Acids Res. 2016 Mar 18;44(5):e45. doi: 10.1093/nar/gkv1191. Epub 2015 Nov 17.

Genome-Wide Identification of Transcription Factor-Binding Sites in Quiescent Adult Neural Stem Cells.

Methods Mol Biol. 2018;1686:265-286. doi: 10.1007/978-1-4939-7371-2_19.

Simultaneous Targeted Methylation Sequencing (sTM-Seq).

Curr Protoc Hum Genet. 2019 Apr;101(1):e81. doi: 10.1002/cphg.81. Epub 2019 Jan 8.

MethGo: a comprehensive tool for analyzing whole-genome bisulfite sequencing data.

BMC Genomics. 2015;16 Suppl 12(Suppl 12):S11. doi: 10.1186/1471-2164-16-S12-S11. Epub 2015 Dec 9.

Cited by

Comprehensive assessment of differential ChIP-seq tools guides optimal algorithm selection.

Genome Biol. 2022 May 24;23(1):119. doi: 10.1186/s13059-022-02686-y.

AnnoMiner is a new web-tool to integrate epigenetics, transcription factor occupancy and transcriptomics data to predict transcriptional regulators.

Sci Rep. 2021 Jul 29;11(1):15463. doi: 10.1038/s41598-021-94805-1.

Global transcriptomic profiling of microcystin-LR or -RR treated hepatocytes (HepaRG).

Toxicon X. 2020 Oct 7;8:100060. doi: 10.1016/j.toxcx.2020.100060. eCollection 2020 Dec.

Coordinated regulation of Rel expression by MAP3K4, CBP, and HDAC6 controls phenotypic switching.

Commun Biol. 2020 Aug 28;3(1):475. doi: 10.1038/s42003-020-01200-z.

Characterization of the Fundulus heteroclitus embryo transcriptional response and development of a gene expression-based fingerprint of exposure for the alternative flame retardant, TBPH (bis (2-ethylhexyl)-tetrabromophthalate).

Environ Pollut. 2019 Apr;247:696-705. doi: 10.1016/j.envpol.2019.01.010. Epub 2019 Jan 10.

Mediator complex component MED13 regulates zygotic genome activation and is required for postimplantation development in the mouse.

Biol Reprod. 2018 Apr 1;98(4):449-464. doi: 10.1093/biolre/ioy004.

Quantitative analysis of ChIP-seq data uncovers dynamic and sustained H3K4me3 and H3K27me3 modulation in cancer cells under hypoxia.

Epigenetics Chromatin. 2016 Nov 1;9:48. doi: 10.1186/s13072-016-0090-4. eCollection 2016.

Expression and methylation data from SLE patient and healthy control blood samples subdivided with respect to ARID3a levels.

Data Brief. 2016 Aug 31;9:213-9. doi: 10.1016/j.dib.2016.08.049. eCollection 2016 Dec.

ABSSeq: a new RNA-Seq analysis method based on modelling absolute expression differences.

BMC Genomics. 2016 Aug 4;17:541. doi: 10.1186/s12864-016-2848-2.

Deficiency of the placenta- and yolk sac-specific tristetraprolin family member ZFP36L3 identifies likely mRNA targets and an unexpected link to placental iron metabolism.

Development. 2016 Apr 15;143(8):1424-33. doi: 10.1242/dev.130369. Epub 2016 Mar 7.
