A phylogenomic gene cluster resource: the Phylogenetically Inferred Groups (PhIGs) database.

Authors

Dehal Paramvir S, Boore Jeffrey L

Affiliations

Evolutionary Genomics Department, DOE Joint Genome Institute and Lawrence Berkeley National Laboratory, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA.

Publication information

BMC Bioinformatics. 2006 Apr 11;7:201. doi: 10.1186/1471-2105-7-201.

DOI:10.1186/1471-2105-7-201
PMID:16608522
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1523372/
Abstract

BACKGROUND

We present here the PhIGs database, a phylogenomic resource for sequenced genomes. Although many methods exist for clustering gene families, very few attempt to create truly orthologous clusters sharing descent from a single ancestral gene across a range of evolutionary depths. Although these non-phylogenetic gene family clusters have been used broadly for gene annotation, they are known to introduce errors through the artifactual association of slowly evolving paralogs and the lack of annotation for more rapidly evolving genes. A full phylogenetic framework is necessary for accurate inference of function and for many studies that address the pattern and mechanism of genome evolution. The automated generation of evolutionary gene clusters, creation of gene trees, determination of orthology and paralogy relationships, and correlation of this information with gene annotations, expression information, and genomic context together constitute an important resource for the scientific community.

DISCUSSION

The PhIGs database currently contains 23 completely sequenced genomes of fungi and metazoans, comprising 409,653 genes that have been grouped into 42,645 gene clusters. Each gene cluster is built such that the gene sequence distances are consistent with the known organismal relationships, thereby maximizing the likelihood that the clusters represent truly orthologous genes. The PhIGs website contains tools that allow the study of genes within their phylogenetic framework through keyword searches on annotations, such as GO and InterPro assignments, and sequence similarity searches by BLAST and HMM. In addition to displaying the evolutionary relationships of the genes in each cluster, the website also allows users to view the relative physical positions of homologous genes in specified sets of genomes.
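
The abstract does not spell out how "consistent with the known organismal relationships" is tested. One simplified reading, offered only as an illustrative sketch and not as the published PhIGs pipeline, is that two genes from an ingroup should be closer to each other than either is to any gene from an outgroup species. The function name, the gene names, and the distance values below are all hypothetical.

```python
from itertools import combinations


def is_consistent_cluster(cluster, outgroup_genes, dist):
    """Return True if every pair of genes in `cluster` is closer to each
    other than either member is to any outgroup gene.

    `dist` maps frozenset({gene_a, gene_b}) to a pairwise sequence distance.
    This is an assumed, simplified criterion, not the PhIGs algorithm itself.
    """
    def d(a, b):
        return dist[frozenset((a, b))]

    for a, b in combinations(cluster, 2):
        # Smallest distance from either member of the pair to any outgroup gene.
        nearest_out = min(
            (min(d(a, o), d(b, o)) for o in outgroup_genes),
            default=float("inf"),
        )
        if d(a, b) >= nearest_out:
            return False
    return True


# Hypothetical toy distances: human_A and mouse_A behave like orthologs,
# human_B like a distant paralog, and fly_A is the outgroup gene.
dist = {
    frozenset(("human_A", "mouse_A")): 0.10,
    frozenset(("human_A", "fly_A")): 0.60,
    frozenset(("mouse_A", "fly_A")): 0.65,
    frozenset(("human_A", "human_B")): 0.80,
    frozenset(("mouse_A", "human_B")): 0.85,
    frozenset(("fly_A", "human_B")): 0.90,
}

ortholog_pair_ok = is_consistent_cluster(["human_A", "mouse_A"], ["fly_A"], dist)
paralog_pair_ok = is_consistent_cluster(["human_A", "human_B"], ["fly_A"], dist)
```

With these toy values the human_A/mouse_A pair passes the check, while the human_A/human_B pair fails because human_A is closer to the outgroup gene than to human_B.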

SUMMARY

Accurate analyses of genes and genomes can only be done within their full phylogenetic context. The PhIGs database and corresponding website http://phigs.org address this problem for the scientific community. Our goal is to expand the content as more genomes are sequenced and use this framework to incorporate more analyses.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a2e/1523372/23427e686384/1471-2105-7-201-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a2e/1523372/b58a412809d4/1471-2105-7-201-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a2e/1523372/b0ba5c7a5450/1471-2105-7-201-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a2e/1523372/cfe36443e9af/1471-2105-7-201-4.jpg
