Identifying cis-regulatory sequences by word profile similarity.

Affiliation

University of California Berkeley and University of California San Francisco Joint Graduate Group in Bioengineering, University of California, Berkeley, California, United States of America.

Publication information

PLoS One. 2009 Sep 4;4(9):e6901. doi: 10.1371/journal.pone.0006901.

DOI:10.1371/journal.pone.0006901
PMID:19730735
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2731932/
Abstract

BACKGROUND: Recognizing regulatory sequences in genomes is a continuing challenge, despite a wealth of available genomic data and a growing number of experimentally validated examples.

METHODOLOGY/PRINCIPAL FINDINGS: We discuss here a simple approach to search for regulatory sequences based on the compositional similarity of genomic regions and known cis-regulatory sequences. This method, which is not limited to searching for predefined motifs, recovers sequences known to be under similar regulatory control. The words shared by the recovered sequences often correspond to known binding sites. Furthermore, we show that although local word profile clustering is predictive for the regulatory sequences involved in blastoderm segmentation, local dissimilarity is a more universal feature of known regulatory sequences in Drosophila.
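
To make the idea of compositional similarity concrete, here is a minimal Python sketch of one way to compare word profiles. It is not the WPH-finder implementation: the abstract does not state the word length, how profiles are normalized, or which similarity measure is used. The sketch assumes fixed-length words (k = 4), merges each word with its reverse complement, and scores sliding windows by cosine similarity. All function names (`word_profile`, `cosine_similarity`, `scan`, `shared_words`) are hypothetical.

```python
from collections import Counter
from math import sqrt

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def word_profile(seq, k=4):
    """Count k-letter words, merging each word with its reverse complement.

    Assumption: k=4 and strand merging are illustrative choices only.
    Words containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):
            counts[min(word, reverse_complement(word))] += 1
    return counts


def cosine_similarity(p, q):
    """Cosine similarity between two word-count profiles (0.0 if either is empty)."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def scan(query, genome, window, step=1, k=4):
    """Score every genomic window against the profile of a known regulatory sequence."""
    query_profile = word_profile(query, k)
    hits = []
    for start in range(0, len(genome) - window + 1, step):
        window_profile = word_profile(genome[start:start + window], k)
        hits.append((start, cosine_similarity(query_profile, window_profile)))
    return hits


def shared_words(seq_a, seq_b, k=4):
    """List words present in both sequences, most abundant shared words first."""
    pa, pb = word_profile(seq_a, k), word_profile(seq_b, k)
    return sorted(pa.keys() & pb.keys(), key=lambda w: -min(pa[w], pb[w]))
```

In this sketch, `scan` would be run with a known cis-regulatory sequence as `query`, and the top-scoring windows inspected with `shared_words` to see whether the words they have in common resemble known binding sites. Because words are merged with their reverse complements, a sequence and its reverse complement yield identical profiles.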

CONCLUSIONS/SIGNIFICANCE: Our method leverages sequence motifs within a known regulatory sequence to identify co-regulated sequences without explicitly defining binding sites. We also show that regulatory sequences can be distinguished from surrounding sequences by local sequence dissimilarity, a novel feature in identifying regulatory sequences across a genome. Source code for WPH-finder is available for download at http://rana.lbl.gov/downloads/wph.tar.gz.
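
The local dissimilarity feature can be illustrated in the same spirit. The following Python sketch rests on assumptions the abstract does not confirm: it defines dissimilarity as one minus the mean cosine similarity between a window's word profile and the profiles of its immediate left and right flanks. The word length (k = 3), the flank size, and the names `kmer_counts`, `cosine`, and `local_dissimilarity` are all hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count overlapping k-letter words in a sequence (k=3 is an illustrative choice)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(p, q):
    """Cosine similarity between two word-count profiles (0.0 if either is empty)."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def local_dissimilarity(genome, start, window, flank, k=3):
    """Assumed definition: 1 - mean similarity of a window to its two flanks.

    Flanks shorter than k (for example at the ends of the sequence) are ignored;
    if no usable flank exists the score is 0.0.
    """
    center = kmer_counts(genome[start:start + window], k)
    left = genome[max(0, start - flank):start]
    right = genome[start + window:start + window + flank]
    sims = [cosine(center, kmer_counts(f, k)) for f in (left, right) if len(f) >= k]
    if not sims:
        return 0.0
    return 1.0 - sum(sims) / len(sims)
```

Under this definition, a window whose word composition matches its surroundings scores near 0, and a window that shares no words with its flanks scores 1. A genome-wide scan would compute this score for every window and rank windows by it.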
