A motif-independent metric for DNA sequence specificity.

Affiliation

Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA.

Publication information

BMC Bioinformatics. 2011 Oct 21;12:408. doi: 10.1186/1471-2105-12-408.

DOI: 10.1186/1471-2105-12-408
PMID: 22017798
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3267244/
Abstract

BACKGROUND

Genome-wide mapping of protein-DNA interactions has been widely used to investigate biological functions of the genome. An important question is to what extent such interactions are regulated at the DNA sequence level. However, current investigation is hampered by the lack of computational methods for systematically evaluating sequence specificity.

RESULTS

We present a simple, unbiased quantitative measure for DNA sequence specificity called the Motif Independent Measure (MIM). By analyzing both simulated and real experimental data, we found that the MIM can be used to detect sequence specificity independent of the presence of transcription factor (TF) binding motifs. We also found that the level of specificity associated with H3K4me1 target sequences is highly cell-type specific and highest in embryonic stem (ES) cells. We predicted H3K4me1 target sequences by using the N-score model and found that the prediction accuracy is indeed high in ES cells. The software to compute the MIM is freely available at: https://github.com/lucapinello/mim.

CONCLUSIONS

Our method provides a unified framework for quantifying DNA sequence specificity and serves as a guide for development of sequence-based prediction models.
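
The abstract does not say how the MIM is computed, so the following is only an illustrative sketch of what a motif-independent comparison can look like, not the authors' definition. It assumes that specificity is scored by comparing k-mer frequency distributions between a set of target sequences and a set of background sequences, using a symmetrized Kullback-Leibler divergence. The function names, the pseudocount, and the choice of k are all hypothetical; the authors' actual implementation is in the repository linked in the abstract.

```python
from collections import Counter
from itertools import product
from math import log


def kmer_distribution(sequences, k, pseudocount=1.0):
    """Return smoothed relative frequencies for all 4**k DNA k-mers.

    The pseudocount is an assumption of this sketch; it keeps every
    probability strictly positive so the divergence below is finite.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts.values()) + pseudocount * len(all_kmers)
    return {kmer: (counts[kmer] + pseudocount) / total for kmer in all_kmers}


def symmetric_kl(p, q):
    """Symmetrized Kullback-Leibler divergence between two distributions
    defined over the same set of k-mers."""
    return sum(
        p[x] * log(p[x] / q[x]) + q[x] * log(q[x] / p[x])
        for x in p
    )


def kmer_divergence(targets, background, k=2):
    """Hypothetical motif-free specificity score: how far the k-mer
    composition of the targets is from that of the background."""
    return symmetric_kl(
        kmer_distribution(targets, k),
        kmer_distribution(background, k),
    )


if __name__ == "__main__":
    targets = ["ACGTACGTACGT", "ACGTACGT"]
    background = ["AAAAAAAATTTT", "CCCCGGGG"]
    print(kmer_divergence(targets, background, k=2))
```

A score of this kind needs no position weight matrix or motif database, which is the sense in which such a comparison is motif independent. A score near zero would mean the targets are compositionally indistinguishable from the background at the chosen k.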
