Comprehensive analysis of orthologous protein domains using the HOPS database.

Authors

Storm Christian E V, Sonnhammer Erik L L

Affiliation

Center for Genomics and Bioinformatics, Karolinska Institutet, S-17177 Stockholm, Sweden.

Publication

Genome Res. 2003 Oct;13(10):2353-62. doi: 10.1101/gr.1305203.

Abstract

One of the most reliable methods for protein function annotation is to transfer experimentally known functions from orthologous proteins in other organisms. Most methods for identifying orthologs operate on a subset of organisms with a completely sequenced genome, and treat proteins as single-domain units. However, it is well known that proteins are often made up of several independent domains, and there is a wealth of protein sequences from genomes that are not completely sequenced. A comprehensive set of protein domain families is found in the Pfam database. We wanted to apply orthology detection to Pfam families, but first some issues needed to be addressed. First, orthology detection becomes impractical and unreliable when too many species are included. Second, shorter domains contain less information. It is therefore important to assess the quality of the orthology assignment and avoid very short domains altogether. We present a database of orthologous protein domains in Pfam called HOPS: Hierarchical grouping of Orthologous and Paralogous Sequences. Orthology is inferred in a hierarchic system of phylogenetic subgroups using ortholog bootstrapping. To avoid the frequent errors stemming from horizontally transferred genes in bacteria, the analysis is presently limited to eukaryotic genes. The results are accessible in the graphical browser NIFAS, a Java tool originally developed for analyzing phylogenetic relations within Pfam families. The method was tested on a set of curated orthologs with experimentally verified function. In comparison to tree reconciliation with a complete species tree, our approach finds significantly more orthologs in the test set. Examples for investigating gene fusions and domain recombination using HOPS are given.
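As a rough illustration of the ortholog bootstrapping idea mentioned in the abstract, the sketch below computes a support value for each candidate ortholog pair as the fraction of bootstrap replicates in which that pair was called orthologous. It is a minimal sketch under stated assumptions, not the HOPS implementation: the per-replicate ortholog calls are assumed to be precomputed from bootstrap trees, the function names and domain identifiers are hypothetical, and the 0.75 cutoff is an arbitrary illustrative value rather than one taken from the paper.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Set, Tuple

Pair = FrozenSet[str]


def ortholog_support(replicate_calls: List[Set[Tuple[str, str]]]) -> Dict[Pair, float]:
    """Fraction of bootstrap replicates in which each pair was called orthologous.

    Each element of replicate_calls holds the ortholog pairs inferred from one
    bootstrap tree (assumed to be computed elsewhere).
    """
    if not replicate_calls:
        return {}
    counts: Counter = Counter()
    for calls in replicate_calls:
        # Treat (a, b) and (b, a) as the same pair and count it once per replicate.
        counts.update({frozenset(pair) for pair in calls})
    total = len(replicate_calls)
    return {pair: n / total for pair, n in counts.items()}


def confident_orthologs(support: Dict[Pair, float], threshold: float = 0.75) -> Set[Pair]:
    """Keep pairs whose support reaches the (illustrative) threshold."""
    return {pair for pair, value in support.items() if value >= threshold}


# Hypothetical calls from four bootstrap replicates of one domain family.
replicates = [
    {("human_A", "mouse_A")},
    {("human_A", "mouse_A")},
    {("mouse_A", "human_A"), ("human_A", "mouse_B")},
    {("human_A", "mouse_B")},
]

support = ortholog_support(replicates)
accepted = confident_orthologs(support)
```

Here the pair human_A/mouse_A is supported by three of four replicates and passes the cutoff, while human_A/mouse_B is supported by only two and does not. A support value of this kind is one way to assess the quality of an orthology assignment, which the abstract notes matters most for short domains.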
