Prachumwat Anuphap, Li Wen-Hsiung
Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA.
Genome Res. 2008 Feb;18(2):221-32. doi: 10.1101/gr.7046608. Epub 2007 Dec 14.
Where did vertebrate genes come from? Here we address this question by analyzing eight completely sequenced land vertebrate genomes and six completely sequenced invertebrate genomes. Approximately 70% of the vertebrate genes can be found in the six invertebrate genomes with the standard homology search criteria (this group is denoted as V.MCL), another approximately 6% can be found with relaxed search criteria, and an additional approximately 2% can be found in sequenced fungal and bacterial genomes. Thus, a substantial proportion of vertebrate genes (approximately 22%) cannot be found in the nonvertebrate genomes studied (this group is denoted as Vonly). Interestingly, genes in Vonly are predominantly singletons, while the majority of genes in the other three groups belong to gene families. The proteins of Vonly tend to evolve faster than those of V.MCL. Surprisingly, in many cases the family sizes in V.MCL are only as large as or even smaller than their counterparts in the invertebrates, contrary to the general perception of a larger family size in vertebrates. In comparison with the family size in invertebrates, vertebrate gene families involved in regulation, signal transduction, transcription, protein transport, and protein modification tend to be expanded, whereas those involved in metabolic processes tend to be contracted. Furthermore, for almost all of the functional categories with family size expansion in vertebrates, gene types (a gene type being either a singleton or a gene family, so that the number of gene types is the number of singletons plus the number of gene families) tend to be over-represented in Vonly but under-represented in V.MCL. Our study suggests that gene function is a major determinant of gene family size.