Computational modeling of oligonucleotide positional densities for human promoter prediction.

Authors

Narang Vipin, Sung Wing-Kin, Mittal Ankush

Affiliation

Department of Computer Science, S16 #06-02, 3 Science Drive 2, National University of Singapore, Singapore 117543, Singapore.

Publication details

Artif Intell Med. 2005 Sep-Oct;35(1-2):107-19. doi: 10.1016/j.artmed.2005.02.005.

DOI: 10.1016/j.artmed.2005.02.005
PMID: 16076553

Abstract

OBJECTIVE

The gene promoter region controls transcription initiation, which is the most important step in gene regulation. In silico detection of promoter regions in genomic sequences has a number of applications in gene discovery and in understanding the regulation of gene expression. However, computational prediction of eukaryotic RNA polymerase II (Pol II) promoters has remained a difficult task. This paper introduces a novel statistical technique for detecting promoter regions in long genomic sequences.

METHOD

A number of existing techniques analyze the occurrence frequencies of oligonucleotides in promoter sequences as compared to other genomic regions. In contrast, the present work studies the positional densities of oligonucleotides in promoter sequences. The analysis does not require any non-promoter sequence dataset or any model of the background oligonucleotide content of the genome. The statistical model learnt from a dataset of promoter sequences automatically recognizes a number of transcription factor binding sites simultaneously with their occurrence positions relative to the transcription start site. Based on this model, a continuous naïve Bayes classifier is developed for the detection of human promoters and transcription start sites in genomic sequences.
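
The idea of scoring oligonucleotides by where they occur, rather than how often, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Gaussian kernel, the bandwidth, the uniform floor, and the uniform positional reference used in the score are all assumptions made for illustration, and the training sequences are toy data.

```python
import math
from collections import defaultdict

K = 4  # oligonucleotide length (assumed for this sketch)

def kmer_positions(seqs, tss_index, k=K):
    """Collect, for every k-mer, its start positions relative to the TSS."""
    positions = defaultdict(list)
    for seq in seqs:
        for i in range(len(seq) - k + 1):
            positions[seq[i:i + k]].append(i - tss_index)
    return positions

def positional_density(samples, x, bandwidth, span):
    """Gaussian kernel density at relative position x, mixed with a small
    uniform floor (assumption) so the logarithm below stays finite."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    kde = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm
    return 0.9 * kde + 0.1 / span

def window_score(window, tss_index, positions, k=K, bandwidth=3.0):
    """Naive Bayes style score: sum of log density ratios over all k-mers
    in the window, each treated as independent. The reference is a uniform
    positional density 1/span (an assumption, not a sequence background)."""
    span = len(window) - k + 1
    score = 0.0
    for i in range(span):
        samples = positions.get(window[i:i + k])
        if samples:  # k-mers unseen in training contribute nothing
            density = positional_density(samples, i - tss_index, bandwidth, span)
            score += math.log(density * span)
    return score

# Toy training set: "TATA" starts 10 bp upstream of a TSS at index 15.
train = [
    "CCGCGTATAGGCCGCGGCCG",
    "GGCGCTATACGGCGCCGGCC",
    "CGGCCTATAGCCGGCGCGGC",
]
model = kmer_positions(train, tss_index=15)

aligned = "AAAAA" + "TATA" + "A" * 11   # TATA at the learnt offset
shifted = "A" * 15 + "TATA" + "A"       # TATA at the wrong offset
print(window_score(aligned, 15, model), window_score(shifted, 15, model))
```

Only promoter sequences are used for training here, in line with the abstract's statement that no non-promoter dataset is required. The same motif scores higher when it appears at the offset seen in training than when it is displaced, which is the positional signal the abstract describes.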

RESULTS

The present study extends the scope of statistical models in general promoter modeling and prediction. Promoter sequence features learnt by the model correlate well with known biological facts. Results of human transcription start site prediction compare favorably with those of existing second-generation promoter prediction tools.

Similar articles

1
Computational modeling of oligonucleotide positional densities for human promoter prediction.
Artif Intell Med. 2005 Sep-Oct;35(1-2):107-19. doi: 10.1016/j.artmed.2005.02.005.
2
EnsemPro: an ensemble approach to predicting transcription start sites in human genomic DNA sequences.
Genomics. 2008 Mar;91(3):259-66. doi: 10.1016/j.ygeno.2007.11.001.
3
Computational detection and location of transcription start sites in mammalian genomic DNA.
Genome Res. 2002 Mar;12(3):458-61. doi: 10.1101/gr.216102.
4
Modeling promoter grammars with evolving hidden Markov models.
Bioinformatics. 2008 Aug 1;24(15):1669-75. doi: 10.1093/bioinformatics/btn254. Epub 2008 Jun 5.
5
Content analysis of the core promoter region of human genes.
In Silico Biol. 2004;4(2):109-25.
6
Integrating genomic data to predict transcription factor binding.
Genome Inform. 2005;16(1):83-94.
7
A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human.
Genome Inform. 2008;20:212-21.
8
Genome-wide prediction of transcriptional regulatory elements of human promoters using gene expression and promoter analysis data.
BMC Bioinformatics. 2006 Jul 4;7:330. doi: 10.1186/1471-2105-7-330.
9
Computational detection of prokaryotic core promoters in genomic sequences.
J Microbiol. 2005 Oct;43(5):411-6.
10
Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction.
Genome Inform. 2008;21:188-99.

Cited by

1
Supervised promoter recognition: a benchmark framework.
BMC Bioinformatics. 2022 Apr 2;23(1):118. doi: 10.1186/s12859-022-04647-5.
2
Genome-Wide Prediction of Transcription Start Sites in Conifers.
Int J Mol Sci. 2022 Feb 3;23(3):1735. doi: 10.3390/ijms23031735.
3
Critical assessment of computational tools for prokaryotic and eukaryotic promoter prediction.
Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbab551.
4
TSSFinder-fast and accurate ab initio prediction of the core promoter in eukaryotic genomes.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab198.
5
Long-read assays shed new light on the transcriptome complexity of a viral pathogen.
Sci Rep. 2020 Aug 14;10(1):13822. doi: 10.1038/s41598-020-70794-5.
6
Putative Auxin and Light Responsive Promoter Elements From the Genome, When Expressed as cDNA, Are Functional in .
Front Plant Sci. 2019 Jun 28;10:804. doi: 10.3389/fpls.2019.00804. eCollection 2019.
7
Alternative Splicing and Protein Diversity: Plants Versus Animals.
Front Plant Sci. 2019 Jun 12;10:708. doi: 10.3389/fpls.2019.00708. eCollection 2019.
8
Image-based promoter prediction: a promoter prediction method based on evolutionarily generated patterns.
Sci Rep. 2018 Dec 6;8(1):17695. doi: 10.1038/s41598-018-36308-0.
9
Long-read sequencing uncovers a complex transcriptome topology in varicella zoster virus.
BMC Genomics. 2018 Dec 4;19(1):873. doi: 10.1186/s12864-018-5267-8.
10
GPMiner: an integrated system for mining combinatorial cis-regulatory elements in mammalian gene group.
BMC Genomics. 2012;13 Suppl 1(Suppl 1):S3. doi: 10.1186/1471-2164-13-S1-S3. Epub 2012 Jan 17.