CNN-Peaks: ChIP-Seq peak detection pipeline using convolutional neural networks that imitate human visual inspection.

Affiliations

School of Computer Science and Engineering, Pusan National University, Busan, 46241, South Korea.

Department of Genetics, Stanford University, Stanford, 94305, USA.

Publication information

Sci Rep. 2020 May 13;10(1):7933. doi: 10.1038/s41598-020-64655-4.

DOI:10.1038/s41598-020-64655-4
PMID:32404971
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7220942/
Abstract

ChIP-seq is one of the core experimental resources available to understand genome-wide epigenetic interactions and identify the functional elements associated with diseases. The analysis of ChIP-seq data is important but poses a difficult computational challenge, due to the presence of irregular noise and bias on various levels. Although many peak-calling methods have been developed, the current computational tools still require, in some cases, human manual inspection using data visualization. However, the huge volumes of ChIP-seq data make it almost impossible for human researchers to manually uncover all the peaks. Recently developed convolutional neural networks (CNN), which are capable of achieving human-like classification accuracy, can be applied to this challenging problem. In this study, we design a novel supervised learning approach for identifying ChIP-seq peaks using CNNs, and integrate it into a software pipeline called CNN-Peaks. We use data labeled by human researchers who annotate the presence or absence of peaks in some genomic segments, as training data for our model. The trained model is then applied to predict peaks in previously unseen genomic segments from multiple ChIP-seq datasets including benchmark datasets commonly used for validation of peak calling methods. We observe a performance superior to that of previous methods.
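
The abstract gives no architectural details, so the following is only a toy sketch of the general idea: a one-dimensional convolution slides over a read-coverage vector for one genomic segment, and the pooled response is turned into a peak/no-peak probability. The function names (`conv1d_valid`, `peak_probability`), the hand-picked `KERNEL`, `WEIGHT`, and `BIAS` values, and the example coverage vectors are all hypothetical and are not taken from CNN-Peaks. In a supervised setting such as the one described, these weights would be learned from the human-labeled segments rather than set by hand.

```python
import math


def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` without padding; return one dot product per offset."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Keep positive responses, zero out the rest."""
    return [max(0.0, v) for v in values]


def sigmoid(x):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def peak_probability(coverage, kernel, weight, bias):
    """Score one segment: convolution -> ReLU -> global max pooling -> sigmoid."""
    features = relu(conv1d_valid(coverage, kernel))
    pooled = max(features)
    return sigmoid(weight * pooled + bias)


# Hypothetical, hand-picked parameters (a trained model would learn these).
KERNEL = [-0.5, 1.0, -0.5]  # responds when the centre bin exceeds its flanks
WEIGHT = 1.0
BIAS = -3.0

# Hypothetical read-coverage values for two short genomic segments.
flat_segment = [5.0, 5.0, 5.0, 5.0, 5.0]
peak_segment = [2.0, 3.0, 10.0, 3.0, 2.0]

p_flat = peak_probability(flat_segment, KERNEL, WEIGHT, BIAS)
p_peak = peak_probability(peak_segment, KERNEL, WEIGHT, BIAS)
print(f"flat segment: {p_flat:.3f}, peak segment: {p_peak:.3f}")
```

With these toy values the segment containing a local bump scores above 0.5 and the flat segment scores below it. A single fixed filter like this would also fire on isolated noise spikes, which is one reason a model trained on labeled examples is attractive for this task.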

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6eb0/7220942/89045c4242c4/41598_2020_64655_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6eb0/7220942/9977554cb1f3/41598_2020_64655_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6eb0/7220942/8db0e2e31a51/41598_2020_64655_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6eb0/7220942/8ce2b5e4dd30/41598_2020_64655_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6eb0/7220942/6c2b819b94f2/41598_2020_64655_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6eb0/7220942/d34abac371f0/41598_2020_64655_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6eb0/7220942/8b3a74c4b52c/41598_2020_64655_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6eb0/7220942/b1887f52fe34/41598_2020_64655_Fig8_HTML.jpg

Similar articles

1
CNN-Peaks: ChIP-Seq peak detection pipeline using convolutional neural networks that imitate human visual inspection.
Sci Rep. 2020 May 13;10(1):7933. doi: 10.1038/s41598-020-64655-4.
2
An improved ChIP-seq peak detection system for simultaneously identifying post-translational modified transcription factors by combinatorial fusion, using SUMOylation as an example.
BMC Genomics. 2014;15 Suppl 1(Suppl 1):S1. doi: 10.1186/1471-2164-15-S1-S1. Epub 2014 Jan 24.
3
Learning Enhancer-Gene associations from Bulk Transcriptomic and Epigenetic Sequencing Data with STITCHIT.
Methods Mol Biol. 2025;2856:341-356. doi: 10.1007/978-1-0716-4136-1_21.
4
Unified Analysis of Multiple ChIP-Seq Datasets.
Methods Mol Biol. 2021;2198:451-465. doi: 10.1007/978-1-0716-0876-0_33.
5
Crunch: integrated processing and modeling of ChIP-seq data in terms of regulatory motifs.
Genome Res. 2019 Jul;29(7):1164-1177. doi: 10.1101/gr.239319.118. Epub 2019 May 28.
6
Chromatin Immunoprecipitation Followed by Next-Generation Sequencing (ChIP-Seq) Analysis in Ewing Sarcoma.
Methods Mol Biol. 2021;2226:265-284. doi: 10.1007/978-1-0716-1020-6_21.
7
RSAT::Plants: Motif Discovery in ChIP-Seq Peaks of Plant Genomes.
Methods Mol Biol. 2016;1482:297-322. doi: 10.1007/978-1-4939-6396-6_19.
8
A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package.
Nucleic Acids Res. 2019 Dec 2;47(21):e139. doi: 10.1093/nar/gkz800.
9
Deep-learning optimized DEOCSU suite provides an iterable pipeline for accurate ChIP-exo peak calling.
Brief Bioinform. 2023 Mar 19;24(2). doi: 10.1093/bib/bbad024.
10
A novel statistical method for quantitative comparison of multiple ChIP-seq datasets.
Bioinformatics. 2015 Jun 15;31(12):1889-96. doi: 10.1093/bioinformatics/btv094. Epub 2015 Feb 13.

Cited by

1
Peak analysis of cell-free RNA finds recurrently protected narrow regions with clinical potential.
Genome Biol. 2025 May 8;26(1):119. doi: 10.1186/s13059-025-03590-x.
2
Improved quality metrics for association and reproducibility in chromatin accessibility data using mutual information.
BMC Bioinformatics. 2023 Nov 22;24(1):441. doi: 10.1186/s12859-023-05553-0.
3
Unsupervised contrastive peak caller for ATAC-seq.
