Genome-wide identification of coding and non-coding conserved sequence tags in human and mouse genomes.

Authors

Mignone Flavio, Anselmo Anna, Donvito Giacinto, Maggi Giorgio P, Grillo Giorgio, Pesole Graziano

Affiliation

Department of Structural Chemistry and Inorganic Stereochemistry, School of Pharmacy, University of Milan, Italy.

Publication

BMC Genomics. 2008 Jun 11;9:277. doi: 10.1186/1471-2164-9-277.

DOI:10.1186/1471-2164-9-277
PMID:18547402
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2442843/
Abstract

BACKGROUND

The accurate detection of genes and the identification of functional regions remain open issues in the annotation of genomic sequences. This problem affects not only new genomes but also those of very well-studied organisms such as human and mouse, where, despite great efforts, the inventory of genes and regulatory regions is far from complete. Comparative genomics is an effective approach to this problem. Unfortunately, it is limited by the computational requirements of genome-wide comparisons and by the difficulty of discriminating between conserved coding and non-coding sequences. This discrimination is often based on, and thus dependent on, the availability of annotated proteins.

RESULTS

In this paper we present the results of a comprehensive comparison of the human and mouse genomes performed with a new high-throughput, grid-based system that allows rapid detection of conserved sequences and accurate assessment of their coding potential. By detecting clusters of coding conserved sequences, the system can also accurately identify potential gene loci. Following this analysis, we created a collection of human-mouse conserved sequence tags and carefully compared our results to reliable annotations in order to benchmark the reliability of our classifications. Strikingly, we were able to detect several potential gene loci supported by EST sequences but not corresponding to any gene annotated so far.

CONCLUSION

Here we present a new system that allows comprehensive comparison of genomes to detect conserved coding and non-coding sequences and to identify potential gene loci. Our system does not require any annotated sequence and is thus suitable for the analysis of new or poorly annotated genomes.
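
The abstract states that potential gene loci are identified by detecting clusters of coding conserved sequence tags, but it does not give the clustering criteria. The following is only a minimal sketch of one plausible way to group such tags by genomic proximity; it is not the authors' implementation. The `Tag` record, the `cluster_coding_tags` function, and the `max_gap` and `min_tags` thresholds are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Tag:
    """A conserved sequence tag (hypothetical record, not the paper's format)."""
    chrom: str
    start: int
    end: int
    coding: bool  # True if the tag was classified as coding


def cluster_coding_tags(
    tags: Iterable[Tag],
    max_gap: int = 10_000,  # assumed threshold, not taken from the paper
    min_tags: int = 2,      # assumed threshold, not taken from the paper
) -> List[List[Tag]]:
    """Group coding tags on the same chromosome that lie within max_gap bases."""
    coding = sorted((t for t in tags if t.coding), key=lambda t: (t.chrom, t.start))
    clusters: List[List[Tag]] = []
    current: List[Tag] = []
    current_end = 0  # furthest end coordinate seen in the current cluster

    for tag in coding:
        same_chrom = bool(current) and tag.chrom == current[-1].chrom
        if same_chrom and tag.start - current_end <= max_gap:
            # Close enough to the running cluster: extend it.
            current.append(tag)
            current_end = max(current_end, tag.end)
        else:
            # Too far away or on a new chromosome: close the running cluster.
            if len(current) >= min_tags:
                clusters.append(current)
            current = [tag]
            current_end = tag.end

    if len(current) >= min_tags:
        clusters.append(current)
    return clusters


example_tags = [
    Tag("chr1", 100, 200, True),
    Tag("chr1", 300, 400, False),      # non-coding, ignored
    Tag("chr1", 5_000, 5_100, True),
    Tag("chr1", 50_000, 50_100, True), # isolated, dropped by min_tags
    Tag("chr2", 150, 250, True),
    Tag("chr2", 900, 1_000, True),
]
candidate_loci = cluster_coding_tags(example_tags)
```

Each returned cluster is a candidate locus in this sketch. A real pipeline would presumably also weigh strand, the coding-potential score, and supporting evidence such as ESTs, none of which is modelled here.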
