Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.

Authors

Rahul Karnik, Michael A. Beer

Affiliations

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States of America.

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States of America; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, MD, United States of America.

Publication information

PLoS One. 2015 Oct 14;10(10):e0140557. doi: 10.1371/journal.pone.0140557. eCollection 2015.

DOI: 10.1371/journal.pone.0140557
PMID: 26465884
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4605740/
Abstract

The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.
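To make the idea of a PWM paired with a learned threshold concrete, here is a minimal Python sketch. It is not the MotifSpec implementation: the abstract does not specify the objective function or the search procedure, so this example assumes a simple stand-in in which each sequence is scored by its best-matching PWM window and the threshold is chosen to maximize classification accuracy on labeled positive and negative sequences. The function names (`log_odds_pwm`, `best_site_score`, `learn_threshold`), the uniform background, and the pseudocount are all illustrative assumptions.

```python
import math

BASES = "ACGT"


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log2-odds PWM.

    Assumes a uniform background frequency (an illustrative simplification).
    """
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def best_site_score(pwm, seq):
    """Return the highest PWM score over all windows of the sequence.

    Sequences are assumed to contain only the characters A, C, G, T.
    """
    width = len(pwm)
    scores = [
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    ]
    return max(scores) if scores else float("-inf")


def learn_threshold(pwm, positives, negatives):
    """Pick the score cutoff that best separates positives from negatives.

    Accuracy is used here as a stand-in discriminative objective; the
    objective actually used by MotifSpec may differ.
    """
    pos = [best_site_score(pwm, s) for s in positives]
    neg = [best_site_score(pwm, s) for s in negatives]
    best_threshold, best_accuracy = None, -1.0
    for t in sorted(set(pos + neg)):
        correct = sum(p >= t for p in pos) + sum(n < t for n in neg)
        accuracy = correct / (len(pos) + len(neg))
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = t, accuracy
    return best_threshold, best_accuracy


# Toy data (hypothetical): a 4-bp motif "TGAC" planted in the positives.
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]
toy_pwm = log_odds_pwm(toy_counts)
threshold, accuracy = learn_threshold(
    toy_pwm,
    positives=["AATGACAA", "CCTGACGG"],
    negatives=["AAAAAAAA", "CCGGCCGG"],
)
```

The point of the sketch is the contrast drawn in the abstract: a motif is judged by how well it separates bound from unbound sequences at a learned cutoff, rather than by how often it occurs in the bound set alone.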

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89fa/4605740/258fd8b6808c/pone.0140557.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89fa/4605740/6c22fd50d823/pone.0140557.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89fa/4605740/1c847f72b531/pone.0140557.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89fa/4605740/60decac2169f/pone.0140557.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89fa/4605740/42618a7b7057/pone.0140557.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89fa/4605740/debf364b3e6a/pone.0140557.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89fa/4605740/8e8723b73206/pone.0140557.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89fa/4605740/c26ca69ae097/pone.0140557.g008.jpg


