Department of Zoology, University of Oxford, Oxford, UK.
School of Biological Sciences, University of Essex, Colchester, UK.
Proc Biol Sci. 2017 Oct 11;284(1864). doi: 10.1098/rspb.2017.1357.
Analysis of genome sequences within a phylogenetic context can give insight into the mode and tempo of gene and protein evolution, including inference of gene ages. This can reveal whether new genes arose on particular evolutionary lineages and were recruited for new functional roles. Here, we apply MCL clustering with all-versus-all reciprocal BLASTP to identify and phylogenetically date 'Homology Groups' among vertebrate proteins. Homology Groups include new genes and highly divergent duplicate genes. Focusing on the origin of the placental mammals within the Eutheria, we identify 357 novel Homology Groups that arose on the stem lineage of Placentalia, 87 of which are deduced to play core roles in mammalian biology as judged by extensive retention in evolution. We find the human homologues of novel eutherian genes are enriched for expression in preimplantation embryo, brain, and testes, and enriched for functions in keratinization, reproductive development, and the immune system.
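The phylogenetic dating step described above can be illustrated with a minimal sketch. It assumes, as a simplification, that a Homology Group is dated to the most recent common ancestor of all species that contain a member. It does not reproduce the BLASTP or MCL steps. The species tree, the clade labels, the group names (`HG_a`, `HG_b`, `HG_c`) and the function names are hypothetical and chosen only for illustration.

```python
from functools import reduce

# Toy species tree as child -> parent. This is a hypothetical, heavily
# simplified topology, not the taxon sampling used in the study.
PARENT = {
    "human": "Placentalia",
    "mouse": "Placentalia",
    "elephant": "Placentalia",
    "Placentalia": "Theria",
    "opossum": "Theria",
    "Theria": "Mammalia",
    "platypus": "Mammalia",
    "Mammalia": "Amniota",
    "chicken": "Amniota",
    "Amniota": "Vertebrata",
    "zebrafish": "Vertebrata",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def mrca(a, b):
    """Most recent common ancestor of two nodes in the toy tree."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError(f"{a!r} and {b!r} share no ancestor")


def date_group(species):
    """Assumed dating rule: a group is as old as the MRCA of its species."""
    return reduce(mrca, species)


# Hypothetical groups, each listed by the species that have a member.
groups = {
    "HG_a": ["human", "mouse", "elephant"],
    "HG_b": ["human", "opossum"],
    "HG_c": ["human", "zebrafish", "chicken"],
}

ages = {name: date_group(members) for name, members in groups.items()}
```

Under this rule, a group found only in placental species is assigned to the Placentalia node, while a single member in an outgroup pushes the inferred origin deeper in the tree. A real analysis would also need to account for gene loss and undetected divergent homologues, which this sketch ignores.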