A phylogenomic gene cluster resource: the Phylogenetically Inferred Groups (PhIGs) database.

Author information

Dehal Paramvir S, Boore Jeffrey L

Affiliation

Evolutionary Genomics Department, DOE Joint Genome Institute and Lawrence Berkeley National Laboratory, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA.

Publication information

BMC Bioinformatics. 2006 Apr 11;7:201. doi: 10.1186/1471-2105-7-201.

Abstract

BACKGROUND

We present the PhIGs database, a phylogenomic resource for sequenced genomes. Although many methods exist for clustering gene families, very few attempt to create truly orthologous clusters, that is, clusters sharing descent from a single ancestral gene, across a range of evolutionary depths. Non-phylogenetic gene family clusters have been used broadly for gene annotation, but they are known to introduce errors through the artifactual association of slowly evolving paralogs and the lack of annotation for more rapidly evolving genes. A full phylogenetic framework is necessary for accurate inference of function and for many studies that address the pattern and mechanism of genome evolution. The automated generation of evolutionary gene clusters, creation of gene trees, determination of orthology and paralogy relationships, and correlation of this information with gene annotations, expression information, and genomic context together form an important resource for the scientific community.

DISCUSSION

The PhIGs database currently contains 23 completely sequenced genomes of fungi and metazoans, comprising 409,653 genes that have been grouped into 42,645 gene clusters. Each gene cluster is built so that the gene sequence distances are consistent with the known organismal relationships, which maximizes the likelihood that the clusters represent truly orthologous genes. The PhIGs website provides tools for studying genes within their phylogenetic framework through keyword searches on annotations, such as GO and InterPro assignments, and through sequence similarity searches by BLAST and HMM. In addition to displaying the evolutionary relationships of the genes in each cluster, the website allows users to view the relative physical positions of homologous genes in specified sets of genomes.
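The idea that distances within a cluster should agree with the organismal relationships can be illustrated with a toy sketch. This is not the PhIGs pipeline: the function name, gene names, distance values, and the specific rule used here (two ingroup genes are linked only if they are closer to each other than either is to any outgroup gene) are assumptions chosen for illustration.

```python
from itertools import combinations

def cluster_ingroup_genes(species_of, distance, ingroup, outgroup):
    """Single-linkage clusters of ingroup genes (hypothetical rule, not PhIGs itself).

    Two ingroup genes are linked only when their distance is smaller than the
    distance from either gene to its nearest outgroup gene.
    """
    in_genes = sorted(g for g, s in species_of.items() if s in ingroup)
    out_genes = [g for g, s in species_of.items() if s in outgroup]

    def dist(a, b):
        # Missing pairs are treated as having no detectable similarity.
        return distance.get(frozenset((a, b)), float("inf"))

    def nearest_outgroup(g):
        return min((dist(g, o) for o in out_genes), default=float("inf"))

    # Union-find over ingroup genes.
    parent = {g: g for g in in_genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(in_genes, 2):
        d = dist(a, b)
        if d < nearest_outgroup(a) and d < nearest_outgroup(b):
            parent[find(a)] = find(b)

    clusters = {}
    for g in in_genes:
        clusters.setdefault(find(g), set()).add(g)
    return sorted(sorted(c) for c in clusters.values())

# Toy data (invented): families A and B duplicated before the ingroup/outgroup split.
species_of = {"hA": "human", "mA": "mouse", "hB": "human",
              "mB": "mouse", "fA": "fly", "fB": "fly"}
pairs = {
    ("hA", "mA"): 0.10, ("hB", "mB"): 0.12,   # orthologs within the ingroup
    ("hA", "fA"): 0.50, ("mA", "fA"): 0.50,   # family A versus its outgroup gene
    ("hB", "fB"): 0.50, ("mB", "fB"): 0.50,   # family B versus its outgroup gene
    ("hA", "hB"): 0.80, ("hA", "mB"): 0.80,   # old paralogs
    ("mA", "hB"): 0.80, ("mA", "mB"): 0.80,
}
distance = {frozenset(p): d for p, d in pairs.items()}

clusters = cluster_ingroup_genes(species_of, distance,
                                 ingroup={"human", "mouse"}, outgroup={"fly"})
print(clusters)
```

In this toy case the human and mouse copies of each family are grouped together, while the two families stay separate because each gene is closer to its fly counterpart than to the paralog, which is the pattern expected when the duplication predates the split from the outgroup.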

SUMMARY

Accurate analyses of genes and genomes can only be done within their full phylogenetic context. The PhIGs database and corresponding website (http://phigs.org) address this problem for the scientific community. Our goal is to expand the content as more genomes are sequenced and to use this framework to incorporate more analyses.
