Wellington: a novel method for the accurate identification of digital genomic footprints from DNase-seq data.

Author information

Warwick Systems Biology Centre, University of Warwick, Coventry, CV4 7AL, United Kingdom; School of Cancer Sciences, Institute of Biomedical Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, B15 2TT, United Kingdom; Department of Statistics, University of Warwick, Coventry, CV4 7AL, United Kingdom; and School of Immunity and Infection, Institute of Biomedical Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, B15 2TT, United Kingdom.

Publication information

Nucleic Acids Res. 2013 Nov;41(21):e201. doi: 10.1093/nar/gkt850. Epub 2013 Sep 25.

DOI: 10.1093/nar/gkt850

PMID: 24071585

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3834841/

Abstract

The expression of eukaryotic genes is regulated by cis-regulatory elements such as promoters and enhancers, which bind sequence-specific DNA-binding proteins. One of the great challenges in the gene regulation field is to characterise these elements. This involves the identification of transcription factor (TF) binding sites within regulatory elements that are occupied in a defined regulatory context. Digestion with DNase and the subsequent analysis of regions protected from cleavage (DNase footprinting) has for many years been used to identify specific binding sites occupied by TFs at individual cis-elements with high resolution. This methodology has recently been adapted for high-throughput sequencing (DNase-seq). In this study, we describe an imbalance in the DNA strand-specific alignment information of DNase-seq data surrounding protein-DNA interactions that allows accurate prediction of occupied TF binding sites. Our study introduces a novel algorithm, Wellington, which considers the imbalance in this strand-specific information to efficiently identify DNA footprints. This algorithm significantly enhances specificity by reducing the proportion of false positives and requires significantly fewer predictions than previously reported methods to recapitulate an equal amount of ChIP-seq data. We also provide an open-source software package, pyDNase, which implements the Wellington algorithm to interface with DNase-seq data and expedite analyses.
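The strand-imbalance idea in the abstract can be illustrated with a small sketch. This is not the published Wellington implementation and does not use the pyDNase API. It assumes, for illustration only, that an occupied site shows forward-strand cuts concentrated in the upstream flank and reverse-strand cuts concentrated in the downstream flank, and it scores that depletion with a one-sided binomial tail. The function names, the `shoulder` parameter, and the choice to multiply the two tail probabilities are all hypothetical.

```python
from math import comb


def binom_sf(k, n, p):
    """Upper tail P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    if k <= 0:
        return 1.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def strand_imbalance_score(fwd_cuts, rev_cuts, start, end, shoulder=35):
    """Score a candidate footprint [start, end); smaller means stronger evidence.

    fwd_cuts / rev_cuts are per-base cut counts on each strand (equal length).
    Assumption of this sketch: under the null, cuts fall uniformly over
    flank + footprint, so the flank count is binomial with success
    probability flank_len / (flank_len + footprint_len).
    """
    if not 0 <= start < end <= len(fwd_cuts) or len(fwd_cuts) != len(rev_cuts):
        raise ValueError("invalid footprint coordinates or mismatched arrays")

    fp_len = end - start
    up_lo = max(0, start - shoulder)            # upstream flank start
    down_hi = min(len(rev_cuts), end + shoulder)  # downstream flank end

    # Forward strand: upstream flank versus footprint interior.
    fwd_flank = sum(fwd_cuts[up_lo:start])
    fwd_inside = sum(fwd_cuts[start:end])
    p_fwd = (start - up_lo) / (start - up_lo + fp_len)

    # Reverse strand: downstream flank versus footprint interior.
    rev_flank = sum(rev_cuts[end:down_hi])
    rev_inside = sum(rev_cuts[start:end])
    p_rev = (down_hi - end) / (down_hi - end + fp_len)

    tail_fwd = binom_sf(fwd_flank, fwd_flank + fwd_inside, p_fwd)
    tail_rev = binom_sf(rev_flank, rev_flank + rev_inside, p_rev)
    # The product is an ad hoc combined score, not a calibrated p-value.
    return tail_fwd * tail_rev


if __name__ == "__main__":
    # Toy region of 40 bp with a protected stretch at positions 20..29.
    fwd = [0] * 10 + [5] * 10 + [0] * 20
    rev = [0] * 30 + [5] * 10
    print(strand_imbalance_score(fwd, rev, 20, 30, shoulder=10))
```

On a flat profile, where both strands are cut evenly everywhere, the two tail probabilities stay large and the score is unremarkable; the score only becomes very small when the flanks carry far more cuts than the protected interior on the expected strands.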

Similar articles

BinDNase: a discriminatory approach for transcription factor binding prediction using DNase I hypersensitivity data.

Bioinformatics. 2015 Sep 1;31(17):2852-9. doi: 10.1093/bioinformatics/btv294. Epub 2015 May 7.

Genomic Footprinting Analyses from DNase-seq Data to Construct Gene Regulatory Networks.

Methods Mol Biol. 2021;2328:25-46. doi: 10.1007/978-1-0716-1534-8_3.

XL-DNase-seq: improved footprinting of dynamic transcription factors.

Epigenetics Chromatin. 2019 Jun 4;12(1):30. doi: 10.1186/s13072-019-0277-6.

Explicit DNase sequence bias modeling enables high-resolution transcription factor footprint detection.

Nucleic Acids Res. 2014 Oct 29;42(19):11865-78. doi: 10.1093/nar/gku810. Epub 2014 Oct 7.

High-resolution mapping of in vivo genomic transcription factor binding sites using in situ DNase I footprinting and ChIP-seq.

DNA Res. 2013 Aug;20(4):325-38. doi: 10.1093/dnares/dst013. Epub 2013 Apr 11.

Analysis of computational footprinting methods for DNase sequencing experiments.

Nat Methods. 2016 Apr;13(4):303-9. doi: 10.1038/nmeth.3772. Epub 2016 Feb 22.

Atlas of Transcription Factor Binding Sites from ENCODE DNase Hypersensitivity Data across 27 Tissue Types.

Cell Rep. 2020 Aug 18;32(7):108029. doi: 10.1016/j.celrep.2020.108029.

TRACE: transcription factor footprinting using chromatin accessibility data and DNA sequence.

Genome Res. 2020 Jul;30(7):1040-1046. doi: 10.1101/gr.258228.119. Epub 2020 Jul 6.

XL-DNase-Seq: Footprinting Analysis of Dynamic Transcription Factors.

Methods Mol Biol. 2024;2846:243-261. doi: 10.1007/978-1-0716-4071-5_15.

Cited by

Comparative Profiling of Regulatory Modules as a Tool for Identifying the Transcription Factor Network Linked to Leukemogenesis.

Methods Mol Biol. 2025;2909:179-209. doi: 10.1007/978-1-0716-4442-3_13.

ChromBPNet: bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants.

bioRxiv. 2025 Jan 8:2024.12.25.630221. doi: 10.1101/2024.12.25.630221.

A single-cell multimodal view on gene regulatory network inference from transcriptomics and chromatin accessibility data.

Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae382.

Characterization of the DNA accessibility of chloroplast genomes in grasses.

Commun Biol. 2024 Jun 22;7(1):760. doi: 10.1038/s42003-024-06374-4.

Uncovering uncharacterized binding of transcription factors from ATAC-seq footprinting data.

Sci Rep. 2024 Apr 23;14(1):9275. doi: 10.1038/s41598-024-59989-2.

Chromatin accessibility profiling methods.

Nat Rev Methods Primers. 2021;1. doi: 10.1038/s43586-020-00008-9. Epub 2021 Jan 21.

Integrative high-throughput enhancer surveying and functional verification divulges a YY2-condensed regulatory axis conferring risk for osteoporosis.

Cell Genom. 2024 Mar 13;4(3):100501. doi: 10.1016/j.xgen.2024.100501. Epub 2024 Feb 8.

Gene regulatory network analysis predicts cooperating transcription factor regulons required for FLT3-ITD+ AML growth.

Cell Rep. 2023 Dec 26;42(12):113568. doi: 10.1016/j.celrep.2023.113568. Epub 2023 Dec 15.

Neuronal transcription of autism gene PTCHD1 is regulated by a conserved downstream enhancer sequence.

Sci Rep. 2023 Nov 21;13(1):20391. doi: 10.1038/s41598-023-46673-0.

Transcriptional reprogramming by mutated IRF4 in lymphoma.

Nat Commun. 2023 Nov 7;14(1):6947. doi: 10.1038/s41467-023-41954-8.

References

Chromatin accessibility data sets show bias due to sequence specificity of the DNase I enzyme.

PLoS One. 2013 Jul 26;8(7):e69853. doi: 10.1371/journal.pone.0069853. Print 2013.

ENCODE data in the UCSC Genome Browser: year 5 update.

Nucleic Acids Res. 2013 Jan;41(Database issue):D56-63. doi: 10.1093/nar/gks1172. Epub 2012 Nov 27.

Current bioinformatic approaches to identify DNase I hypersensitive sites and genomic footprints from DNase-seq data.

Front Genet. 2012 Oct 31;3:230. doi: 10.3389/fgene.2012.00230. eCollection 2012.

An expansive human regulatory lexicon encoded in transcription factor footprints.

Nature. 2012 Sep 6;489(7414):83-90. doi: 10.1038/nature11212.

An integrated encyclopedia of DNA elements in the human genome.

Nature. 2012 Sep 6;489(7414):57-74. doi: 10.1038/nature11247.

Transcription factors: from enhancer binding to developmental control.

Nat Rev Genet. 2012 Sep;13(9):613-26. doi: 10.1038/nrg3207. Epub 2012 Aug 7.

Waves of retrotransposon expansion remodel genome organization and CTCF binding in multiple mammalian lineages.

Cell. 2012 Jan 20;148(1-2):335-48. doi: 10.1016/j.cell.2011.11.058. Epub 2012 Jan 12.

Comprehensive genome-wide protein-DNA interactions detected at single-nucleotide resolution.

Cell. 2011 Dec 9;147(6):1408-19. doi: 10.1016/j.cell.2011.11.013.

Dynamic exchange at regulatory elements during chromatin remodeling underlies assisted loading mechanism.

Cell. 2011 Aug 19;146(4):544-54. doi: 10.1016/j.cell.2011.07.006. Epub 2011 Aug 11.

Structure and function of active chromatin and DNase I hypersensitive sites.

FEBS J. 2011 Jul;278(13):2182-210. doi: 10.1111/j.1742-4658.2011.08128.x. Epub 2011 May 26.
