Richard Guy-Franck
Institut Pasteur, Department Genomes & Genetics, Paris, France
CNRS, Paris, France
The first eukaryotes emerged from their prokaryotic ancestors more than 1.5 billion years ago and rapidly spread over the planet, first in the ocean and later on land as animals, plants, and fungi. Taking advantage of expanding genome complexity and flexibility, they invaded almost all known ecological niches, adapting their body plan, physiology, and metabolism to new environments. This increase in genome complexity came with an increase in gene repertoire, mainly through molecular reassortment of existing protein domains, but sometimes through the capture of a piece of viral genome or of a transposon sequence. With increasing sequencing and computing power, it has become possible to decipher eukaryotic genome content at an unprecedented scale, collecting all genes belonging to a given species with the aim of compiling all the essential and dispensable genes that make eukaryotic life possible. In this chapter, the concepts of eukaryotic core genomes and pangenomes will be described, as well as the notions of closed and open genomes. Among all eukaryotes sequenced to date, ascomycetous yeasts are arguably the best-described clade, and the pangenomes of Saccharomyces cerevisiae and Schizosaccharomyces pombe, as well as of other species, will be reviewed. For scientific and economic reasons, many plant genomes have been sequenced too, and the gene content of soybean, cabbage, poplar, thale cress, rice, maize, and barley will be outlined. Planktonic life forms, such as a chromalveolate and a green alga, will be detailed and their pangenomes pictured. Mechanisms generating genetic diversity, such as interspecific hybridization, whole-genome duplication, segmental duplication, horizontal gene transfer, and single-gene duplication, will be described and exemplified. Finally, the computational approaches used to calculate core genome and pangenome content will be briefly described, as well as possible future directions in eukaryotic comparative genomics.