School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW 2052, Australia.
School of Engineering, UNSW Sydney, Sydney, NSW 2052, Australia.
Mol Biol Evol. 2024 Jul 3;41(7). doi: 10.1093/molbev/msae146.
Ultraconserved elements were discovered two decades ago and arbitrarily defined as sequences that are identical over a length ≥200 bp in the human, mouse, and rat genomes. The definition was subsequently extended to sequences ≥100 bp identical in at least three of five mammalian genomes (including dog and cow), and these elements were shown to have undergone rapid expansion from ancestors in fish and strong negative selection in birds and mammals. Since then, many more genomes have become available, allowing better definition and more thorough examination of ultraconserved element distribution and evolutionary history. We developed a fast and flexible analytical pipeline for identifying ultraconserved elements in multiple genomes, dedUCE, which allows the minimum length, the sequence identity, and the number of species with a detectable ultraconserved element to be set through user-specified parameters. We suggest an updated definition of ultraconserved elements as sequences ≥100 bp with ≥97% sequence identity in ≥50% of placental mammal orders (12,813 ultraconserved elements). By mapping ultraconserved elements to approximately 200 species, we find that placental ultraconserved elements appeared early in vertebrate evolution, well before land colonization, suggesting that the evolutionary pressures driving ultraconserved element selection were present in aquatic environments in the Cambrian to Devonian periods. Most (>90%) ultraconserved elements likely appeared after the divergence of gnathostomes from jawless predecessors, were largely established in sequence identity by early Sarcopterygii evolution (before the divergence of lobe-finned fishes from tetrapods), and became nearly fixed in the amniotes. Ultraconserved elements are mainly located in the introns of protein-coding and noncoding genes involved in neurological and skeletomuscular development, are enriched in regulatory elements, and are dynamically expressed throughout embryonic development.
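The three thresholds in the proposed definition (minimum length, minimum sequence identity, and the fraction of placental mammal orders in which the element is detected) can be illustrated with a minimal Python sketch. This is not the dedUCE implementation or its interface: the `Hit` class, the `passes_definition` function, and all parameter names are hypothetical, and the sketch assumes that an element counts as detected in an order when at least one species of that order has a hit at or above the identity threshold.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment of a candidate element to a species (hypothetical structure)."""
    species: str
    order: str        # taxonomic order the species belongs to
    identity: float   # fraction of identical bases, between 0.0 and 1.0


def passes_definition(length_bp, hits, all_orders,
                      min_length=100, min_identity=0.97,
                      min_order_fraction=0.5):
    """Return True if a candidate meets the length, identity, and order-coverage thresholds."""
    if length_bp < min_length:
        return False
    # Assumption: one qualifying species is enough to count its order as covered.
    covered = {h.order for h in hits
               if h.identity >= min_identity and h.order in all_orders}
    return len(covered) / len(all_orders) >= min_order_fraction


# Illustrative input only; the orders and identity values are made up.
orders = {"Primates", "Rodentia", "Carnivora", "Artiodactyla"}
hits = [
    Hit("human", "Primates", 1.00),
    Hit("mouse", "Rodentia", 0.98),
    Hit("dog", "Carnivora", 0.95),   # below the identity threshold
]
print(passes_definition(120, hits, orders))
```

In this toy input, two of the four orders have a hit at or above 97% identity, so a 120 bp candidate meets the 50% order threshold. Exposing the thresholds as keyword arguments mirrors the adjustable parameters the abstract describes, so the earlier, stricter definitions could be expressed by changing the defaults.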