USDA-ARS National Center for Cool and Cold Water Aquaculture, Kearneysville, WV 25430, USA.
Centro de Investigaciones Biomédicas, Universidade de Vigo, Campus Universitario Lagoas Marcosende, 36310 Vigo, España.
G3 (Bethesda). 2021 Apr 15;11(4). doi: 10.1093/g3journal/jkab052.
There is still a need to improve the contiguity of the rainbow trout reference genome and to use multiple genetic backgrounds that represent the genetic diversity of this species. The Arlee doubled haploid line originated from a domesticated hatchery strain that was originally collected from the northern California coast. The Canu pipeline was used to generate the Arlee line de novo genome assembly from high-coverage PacBio long-read sequence data. The assembly was further improved with Bionano optical maps and Hi-C proximity ligation sequence data to generate 32 major scaffolds corresponding to the karyotype of the Arlee line (2N = 64). The assembly comprises 938 scaffolds with an N50 of 39.16 Mb and a total length of 2.33 Gb, of which ∼95% is in 32 chromosome sequences with only 438 gaps between contigs and scaffolds. In rainbow trout the haploid chromosome number can vary from 29 to 32. In the Arlee karyotype the haploid chromosome number is 32 because chromosomes Omy04, 14 and 25 are divided into six acrocentric chromosomes. Additional structural variations identified in the Arlee genome include major inversions on chromosomes Omy05 and Omy20 and 15 additional smaller inversions that require further validation. This is also the first rainbow trout genome assembly that includes a scaffold with the sex-determination gene (sdY) in the chromosome Y sequence. The utility of this genome assembly is shown through the improved annotation of the duplicated genome loci that harbor the IGH genes on chromosomes Omy12 and Omy13.
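For readers unfamiliar with the N50 contiguity statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and report the length at which the running sum first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions; they are not the Arlee scaffold lengths and not the authors' code.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy lengths in Mb (not taken from the Arlee assembly).
toy_scaffolds = [80, 70, 50, 30, 20]
print(n50(toy_scaffolds))  # prints 70
```

In the toy example the total is 250 Mb; the two longest scaffolds sum to 150 Mb, which is the first running total to reach half of 250, so the N50 is 70 Mb.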