SigSeeker: a peak-calling ensemble approach for constructing epigenetic signatures.

Affiliations

Genetics and Molecular Biology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA.

Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA.

Publication information

Bioinformatics. 2017 Sep 1;33(17):2615-2621. doi: 10.1093/bioinformatics/btx276.

Abstract

MOTIVATION

Epigenetic data are invaluable for determining the regulatory programs governing a cell. Because next-generation sequencing data are used to characterize epigenetic marks and transcription factor binding, numerous peak-calling approaches have been developed to identify sites of genomic significance in these data. Such analyses can produce a large number of false positive predictions, suggesting that sites supported by multiple algorithms provide a stronger foundation for inferring and characterizing the regulatory programs associated with the epigenetic data. Few methodologies integrate the predictions of multiple approaches when combining epigenetic profiles generated by different tools.

RESULTS

The SigSeeker peak-calling ensemble uses multiple tools to identify peaks and, with user-defined thresholds for peak overlap and signal strength, retains only those peaks that are concordant across multiple tools. Peaks that are predicted to be co-localized by only a very small number of tools, that overlap only marginally, or that represent significant outliers to the approximation model are removed from the results, yielding concise, high-quality epigenetic datasets. SigSeeker has been validated using established benchmarks for transcription factor binding and histone modification ChIP-Seq data. These comparisons indicate that the quality of our ensemble technique exceeds that of single-tool approaches, enhances existing peak-calling ensembles, and results in epigenetic profiles of higher confidence.
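
The filtering idea described above can be illustrated with a minimal sketch. This is not SigSeeker's implementation: the `Peak` record, the `consensus_peaks` function, the parameter names `min_tools`, `min_overlap`, and `min_signal`, the tool labels, and the greedy left-to-right clustering are all assumptions made for illustration, and the outlier check against the approximation model is not modeled here.

```python
from collections import defaultdict
from typing import NamedTuple


class Peak(NamedTuple):
    chrom: str
    start: int      # 0-based, inclusive
    end: int        # exclusive
    tool: str       # hypothetical label of the peak caller that produced the peak
    signal: float   # hypothetical signal-strength score


def overlap_fraction(a: Peak, b: Peak) -> float:
    """Shared length divided by the length of the shorter peak."""
    shared = min(a.end, b.end) - max(a.start, b.start)
    if a.chrom != b.chrom or shared <= 0:
        return 0.0
    return shared / min(a.end - a.start, b.end - b.start)


def _emit(cluster, min_tools):
    """Turn a cluster into one merged region if enough distinct tools support it."""
    tools = sorted({p.tool for p in cluster})
    if len(tools) < min_tools:
        return []
    start = min(p.start for p in cluster)
    end = max(p.end for p in cluster)
    return [(cluster[0].chrom, start, end, tools)]


def consensus_peaks(peaks, min_tools=2, min_overlap=0.5, min_signal=0.0):
    """Keep merged regions supported by at least `min_tools` distinct tools.

    Peaks below `min_signal` are dropped first. A peak joins the current
    cluster only if it overlaps some member by at least `min_overlap`,
    so marginally overlapping peaks do not count as support.
    """
    by_chrom = defaultdict(list)
    for p in peaks:
        if p.signal >= min_signal:
            by_chrom[p.chrom].append(p)

    results = []
    for chrom in sorted(by_chrom):
        items = sorted(by_chrom[chrom], key=lambda p: p.start)
        cluster = [items[0]]
        for p in items[1:]:
            if any(overlap_fraction(p, q) >= min_overlap for q in cluster):
                cluster.append(p)
            else:
                results.extend(_emit(cluster, min_tools))
                cluster = [p]
        results.extend(_emit(cluster, min_tools))
    return results


example = [
    Peak("chr1", 100, 200, "toolA", 8.0),
    Peak("chr1", 120, 210, "toolB", 6.5),
    Peak("chr1", 195, 400, "toolC", 7.0),  # overlaps the others only marginally
    Peak("chr2", 50, 80, "toolA", 9.0),    # supported by a single tool
]
print(consensus_peaks(example))
# [('chr1', 100, 210, ['toolA', 'toolB'])]
```

In this toy input, only the chr1 region called by both `toolA` and `toolB` survives; the marginally overlapping `toolC` peak and the single-tool chr2 peak are discarded, mirroring the concordance criteria stated in the abstract.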

AVAILABILITY AND IMPLEMENTATION

http://sigseeker.org.

CONTACT

lichtenbergj@mail.nih.gov.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
