Statistical extraction of Drosophila cis-regulatory modules using exhaustive assessment of local word frequency.

Authors

Nazina Anna G, Papatsenko Dmitri A

Affiliation

Department of Biology, New York University, New York, USA.

Publication

BMC Bioinformatics. 2003 Dec 22;4:65. doi: 10.1186/1471-2105-4-65.

DOI:10.1186/1471-2105-4-65
PMID:14690551
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC341902/
Abstract

BACKGROUND

Transcription regulatory regions in higher eukaryotes are often represented by cis-regulatory modules (CRMs) and are responsible for the formation of specific spatial and temporal gene expression patterns. These extended regions, approximately 1 kb long, are found far from coding sequences and cannot be extracted from the genome on the basis of their position relative to the coding regions.

RESULTS

To explore the feasibility of CRM extraction from a genome, we generated an original training set, containing annotated sequence data for most of the known developmental CRMs from Drosophila. Based on this set of experimental data, we developed a strategy for statistical extraction of cis-regulatory modules from the genome, using exhaustive analysis of local word frequency (LWF). To assess the performance of our analysis, we measured the correlation between predictions generated by the LWF algorithm and the distribution of conserved non-coding regions in a number of Drosophila developmental genes.
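The abstract does not spell out how local word frequency is computed, so the following is only a minimal sketch of one way a word-frequency scan could look. Everything in it is an assumption made for illustration rather than the authors' LWF algorithm: the word length `k`, the sliding `window` and `step`, the log-ratio weighting against a background set, and the function names `count_words`, `log_ratio_weights`, and `window_scores`.

```python
import math
from collections import Counter


def count_words(seq, k):
    """Count every overlapping k-letter word in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def log_ratio_weights(train_seqs, background_seqs, k, pseudocount=1.0):
    """Weight each word by log(frequency in training set / frequency in background).

    Assumption: a pseudocount-smoothed log ratio is used here purely for
    illustration; the source abstract does not describe the scoring statistic.
    """
    pos, neg = Counter(), Counter()
    for s in train_seqs:
        pos.update(count_words(s, k))
    for s in background_seqs:
        neg.update(count_words(s, k))
    words = set(pos) | set(neg)
    pos_total = sum(pos.values()) + pseudocount * len(words)
    neg_total = sum(neg.values()) + pseudocount * len(words)
    return {
        w: math.log(((pos[w] + pseudocount) / pos_total)
                    / ((neg[w] + pseudocount) / neg_total))
        for w in words
    }


def window_scores(seq, weights, k, window, step):
    """Slide a window along seq and sum the weights of the words inside it."""
    scores = []
    last_start = max(len(seq) - window + 1, 1)
    for start in range(0, last_start, step):
        counts = count_words(seq[start:start + window], k)
        score = sum(weights.get(w, 0.0) * n for w, n in counts.items())
        scores.append((start, score))
    return scores
```

In this sketch, windows enriched in words that are common in the training set receive positive scores, and words never seen in either set contribute nothing. A real analysis would need realistic training and background sequences and a principled choice of `k` and `window`.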

CONCLUSIONS

In most of the cases tested, we observed high correlation (up to 0.6-0.8, measured on the entire gene locus) between the two independent techniques. We discuss computational strategies available for extraction of Drosophila CRMs and possible extensions of these methods.
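The comparison reported above can be illustrated with a small sketch that correlates two per-position tracks over a locus. The abstract does not say which correlation coefficient was used; Pearson correlation is assumed here, and the function name `pearson` and the track representation (two equal-length lists of numbers) are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric tracks.

    Assumption: x could be a per-window prediction score and y a per-window
    conservation score along the same locus; both names are illustrative.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("tracks must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant track")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 means the two tracks rise and fall together along the locus, and a value near 0 means they are unrelated.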

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bd4/341902/135219a3d2de/1471-2105-4-65-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bd4/341902/b686ca417ae5/1471-2105-4-65-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bd4/341902/69d4fc7aecb6/1471-2105-4-65-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bd4/341902/7a0a75b5e9c7/1471-2105-4-65-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bd4/341902/2a10f7411f36/1471-2105-4-65-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bd4/341902/ee9c72a20249/1471-2105-4-65-6.jpg

Similar articles

1
Statistical extraction of Drosophila cis-regulatory modules using exhaustive assessment of local word frequency.
BMC Bioinformatics. 2003 Dec 22;4:65. doi: 10.1186/1471-2105-4-65.
2
Prediction of similarly acting cis-regulatory modules by subsequence profiling and comparative genomics in Drosophila melanogaster and D. pseudoobscura.
Bioinformatics. 2004 Nov 1;20(16):2738-50. doi: 10.1093/bioinformatics/bth320. Epub 2004 May 14.
3
De novo prediction of cis-regulatory elements and modules through integrative analysis of a large number of ChIP datasets.
BMC Genomics. 2014 Dec 2;15:1047. doi: 10.1186/1471-2164-15-1047.
4
Homotypic regulatory clusters in Drosophila.
Genome Res. 2003 Apr;13(4):579-88. doi: 10.1101/gr.668403.
5
Functional evolution of cis-regulatory modules at a homeotic gene in Drosophila.
PLoS Genet. 2009 Nov;5(11):e1000709. doi: 10.1371/journal.pgen.1000709. Epub 2009 Nov 6.
6
Computational annotation of transcription factor binding sites in D. melanogaster developmental genes.
Genome Inform. 2006;17(2):14-24.
7
Computation-based discovery of cis-regulatory modules by hidden Markov model.
J Comput Biol. 2008 Apr;15(3):279-90. doi: 10.1089/cmb.2008.0024.
8
Computational methods for the detection of cis-regulatory modules.
Brief Bioinform. 2009 Sep;10(5):509-24. doi: 10.1093/bib/bbp025. Epub 2009 Jun 4.
9
Expression-guided in silico evaluation of candidate cis regulatory codes for Drosophila muscle founder cells.
PLoS Comput Biol. 2006 May;2(5):e53. doi: 10.1371/journal.pcbi.0020053. Epub 2006 May 26.
10
Computational detection of genomic cis-regulatory modules applied to body patterning in the early Drosophila embryo.
BMC Bioinformatics. 2002 Oct 24;3:30. doi: 10.1186/1471-2105-3-30.

Cited by

1
Imogene: identification of motifs and cis-regulatory modules underlying gene co-regulation.
Nucleic Acids Res. 2014 Jun;42(10):6128-45. doi: 10.1093/nar/gku209. Epub 2014 Mar 25.
2
A statistical thin-tail test of predicting regulatory regions in the Drosophila genome.
Theor Biol Med Model. 2013 Feb 14;10:11. doi: 10.1186/1742-4682-10-11.
3
A machine learning approach for identifying novel cell type-specific transcriptional regulators of myogenesis.
PLoS Genet. 2012;8(3):e1002531. doi: 10.1371/journal.pgen.1002531. Epub 2012 Mar 8.
4
Genome-wide identification of cis-regulatory motifs and modules underlying gene coregulation using statistics and phylogeny.
Proc Natl Acad Sci U S A. 2010 Aug 17;107(33):14615-20. doi: 10.1073/pnas.1002876107. Epub 2010 Jul 29.
5
An alignment-free method to identify candidate orthologous enhancers in multiple Drosophila genomes.
Bioinformatics. 2010 Sep 1;26(17):2109-15. doi: 10.1093/bioinformatics/btq358. Epub 2010 Jul 11.
6
Most transcription factor binding sites are in a few mosaic classes of the human genome.
BMC Genomics. 2010 May 6;11:286. doi: 10.1186/1471-2164-11-286.
7
Motif-blind, genome-wide discovery of cis-regulatory modules in Drosophila and mouse.
Dev Cell. 2009 Oct;17(4):568-79. doi: 10.1016/j.devcel.2009.09.002.
8
Identifying cis-regulatory sequences by word profile similarity.
PLoS One. 2009 Sep 4;4(9):e6901. doi: 10.1371/journal.pone.0006901.
9
Identifying regulatory elements in eukaryotic genomes.
Brief Funct Genomic Proteomic. 2009 Jul;8(4):215-30. doi: 10.1093/bfgp/elp014. Epub 2009 Jun 4.
10
Finding evolutionarily conserved cis-regulatory modules with a universal set of motifs.
BMC Bioinformatics. 2009 Mar 10;10:82. doi: 10.1186/1471-2105-10-82.

References

1
Recognition of eukaryotic promoters using a genetic algorithm based on iterative discriminant analysis.
In Silico Biol. 2003;3(1-2):81-7. Epub 2003 Feb 27.
2
Homotypic regulatory clusters in Drosophila.
Genome Res. 2003 Apr;13(4):579-88. doi: 10.1101/gr.668403.
3
Phylogenetic shadowing of primate sequences to find functional regions of the human genome.
Science. 2003 Feb 28;299(5611):1391-4. doi: 10.1126/science.1081331.
4
Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome.
Genome Biol. 2002;3(12):RESEARCH0086. doi: 10.1186/gb-2002-3-12-research0086. Epub 2002 Dec 30.
5
Strategies and tools for whole-genome alignments.
Genome Res. 2003 Jan;13(1):73-80. doi: 10.1101/gr.762503.
6
PipTools: a computational toolkit to annotate and analyze pairwise comparisons of genomic sequences.
Genomics. 2002 Dec;80(6):681-90. doi: 10.1006/geno.2002.7018.
7
Computational detection of genomic cis-regulatory modules applied to body patterning in the early Drosophila embryo.
BMC Bioinformatics. 2002 Oct 24;3:30. doi: 10.1186/1471-2105-3-30.
8
Anterior repression of a Drosophila stripe enhancer requires three position-specific mechanisms.
Development. 2002 Nov;129(21):4931-40. doi: 10.1242/dev.129.21.4931.
9
Sharp borders from fuzzy gradients.
Trends Genet. 2002 Aug;18(8):385-7. doi: 10.1016/s0168-9525(02)02724-5.
10
Patchy interspecific sequence similarities efficiently identify positive cis-regulatory elements in the sea urchin.
Dev Biol. 2002 Jun 1;246(1):148-61. doi: 10.1006/dbio.2002.0618.