Department of Molecular Genetics, Weizmann Institute of Science, 76100, Rehovot, Israel.
HUGO Gene Nomenclature Committee, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK.
BMC Evol Biol. 2020 Apr 15;20(1):42. doi: 10.1186/s12862-020-01607-6.
Olfactory receptors (ORs) are G protein-coupled receptors with a crucial role in odor detection. A typical mammalian genome harbors ~1000 OR genes and pseudogenes; however, different gene duplication and deletion events have occurred in each species, resulting in complex orthology relationships. The human OR nomenclature is widely accepted and is based on phylogenetic classification into 18 families and, within them, subfamilies. For other mammals, by contrast, multiple different nomenclature systems are currently in use, which conceals important evolutionary and functional insights.
Here, we describe the Mutual Maximum Similarity (MMS) algorithm, a systematic classifier for assigning a human-centric nomenclature to any OR gene based on inter-species hierarchical pairwise similarities. MMS was applied to the OR repertoires of seven mammals and zebrafish. Altogether, we assigned symbols to 10,249 ORs. This nomenclature is supported by both phylogenetic and synteny analyses. A unified nomenclature provides a framework for diverse studies, in which textual comparison of symbols allows immediate identification of potential ortholog groups as well as species-specific expansions and deletions; for example, Or52e5 and Or52e5b represent a rat-specific duplication of OR52E5. Another example is the complete absence of OR subfamily OR6Z among primate OR symbols. In other mammals, OR6Z members are located in one genomic cluster, suggesting a large deletion in the great ape lineage. An additional 14 mammalian OR subfamilies are missing from the primate genomes. While in chimpanzee 87% of the symbols were identical to human symbols, this proportion decreased to ~50% in dog and cow and to ~30% in rodents, reflecting the adaptive changes of the OR gene superfamily across diverse ecological niches. Application of the proposed nomenclature to zebrafish revealed similarity to mammalian ORs that could not be detected from the current zebrafish olfactory receptor gene nomenclature.
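The textual symbol comparison described above can be illustrated with a minimal Python sketch. It is not the MMS algorithm and not the authors' code; it only shows how, once symbols have been assigned, simple string handling might surface duplications, missing subfamilies, and the share of symbols identical to human ones. The symbol layout assumed by the regular expression (family number, subfamily letters, member number, optional trailing letters), all function names, and the toy symbol lists (including `Or6z1`) are illustrative assumptions.

```python
import re
from collections import defaultdict

# Assumed layout: "OR" + family number + subfamily letters + member number
# + optional trailing letters (such as the "b" in Or52e5b). Hypothetical pattern.
SYMBOL_RE = re.compile(r"^OR(\d+)([A-Z]+)(\d+)([A-Z]*)$")


def parse_symbol(symbol):
    """Split a symbol into family, subfamily, base symbol and suffix (case-insensitive)."""
    match = SYMBOL_RE.match(symbol.upper())
    if match is None:
        raise ValueError(f"unrecognized OR symbol: {symbol!r}")
    family, subfamily, member, suffix = match.groups()
    return {
        "family": "OR" + family,
        "subfamily": "OR" + family + subfamily,
        "base": "OR" + family + subfamily + member,
        "suffix": suffix,
    }


def group_by_base(symbols):
    """Group symbols by base symbol; groups with more than one entry hint at duplications."""
    groups = defaultdict(list)
    for symbol in symbols:
        groups[parse_symbol(symbol)["base"]].append(symbol)
    return dict(groups)


def missing_subfamilies(symbols, reference_symbols):
    """Subfamilies that occur in `symbols` but not in `reference_symbols`."""
    present = {parse_symbol(s)["subfamily"] for s in symbols}
    reference = {parse_symbol(s)["subfamily"] for s in reference_symbols}
    return present - reference


def identical_fraction(symbols, reference_symbols):
    """Fraction of `symbols` that match a reference symbol exactly, ignoring case."""
    reference = {s.upper() for s in reference_symbols}
    return sum(s.upper() in reference for s in symbols) / len(symbols)


# Toy data for illustration only; these are not real repertoires.
human = ["OR52E5", "OR1A1"]
rat = ["Or52e5", "Or52e5b", "Or6z1"]

print(group_by_base(rat))               # duplication candidates share a base symbol
print(missing_subfamilies(rat, human))  # subfamilies absent from the reference list
print(identical_fraction(rat, human))   # share of symbols identical to the reference
```

Grouping on the base symbol keeps `Or52e5` and `Or52e5b` together under `OR52E5`, which mirrors the rat example in the text; how trailing letters should really be interpreted depends on the published naming rules, which this sketch does not attempt to reproduce.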
We have consolidated a unified standard nomenclature system for the vertebrate OR superfamily. The new nomenclature system will be applied to cow, horse, dog and chimpanzee by the Vertebrate Gene Nomenclature Committee, and its implementation is currently under consideration by other relevant species-specific nomenclature committees.