United States Department of Agriculture, National Center for Cool and Cold Water Aquaculture, Agricultural Research Service, Kearneysville, WV 25430, USA.
Troutlodge Inc., Sumner, WA 98390, USA.
G3 (Bethesda). 2024 Sep 4;14(9). doi: 10.1093/g3journal/jkae168.
With the rapid and significant cost reduction of next-generation sequencing, low-coverage whole-genome sequencing (lcWGS), followed by genotype imputation, is becoming a cost-effective alternative to single-nucleotide polymorphism (SNP)-array genotyping. The objectives of this study were 2-fold: (1) construct a haplotype reference panel for genotype imputation from lcWGS data in rainbow trout (Oncorhynchus mykiss); and (2) evaluate the concordance between imputed genotypes and SNP-array genotypes in 2 breeding populations. Medium-coverage (12×) whole-genome sequences were obtained from a total of 410 fish representing 5 breeding populations with various spawning dates. The short-read sequences were mapped to the rainbow trout reference genome, and genetic variants were identified using GATK. After data filtering, 20,434,612 biallelic SNPs were retained. The reference panel was phased with SHAPEIT5 and was used as a reference to impute genotypes from lcWGS data employing GLIMPSE2. A total of 90 fish from the Troutlodge November breeding population were sequenced with an average coverage of 1.3×, and these fish were also genotyped with the Axiom 57K rainbow trout SNP array. The concordance between array-based genotypes and imputed genotypes was 99.1%. After downsampling the coverage to 0.5×, 0.2×, and 0.1×, the concordance between array-based genotypes and imputed genotypes was 98.7, 97.8, and 96.7%, respectively. In the USDA odd-year breeding population, the concordance between array-based genotypes and imputed genotypes was 97.8% for 109 fish downsampled to 0.5× coverage. Therefore, the reference haplotype panel reported in this study can be used to accurately impute genotypes from lcWGS data in rainbow trout breeding populations.
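The concordance figures reported above compare, fish by fish and SNP by SNP, the array genotype with the imputed genotype. The abstract does not give the exact formula, so the following is only a minimal sketch under stated assumptions: genotypes are coded as alternate-allele counts (0, 1, 2), a missing call is `None`, and concordance is the fraction of identical calls among the sites called on both platforms. The function name `genotype_concordance` and the toy data are hypothetical, not taken from the study.

```python
from typing import Optional, Sequence


def genotype_concordance(
    array_calls: Sequence[Optional[int]],
    imputed_calls: Sequence[Optional[int]],
) -> float:
    """Return the fraction of matching genotypes among sites called in both inputs.

    Assumption: genotypes are alternate-allele counts (0, 1 or 2) and
    missing calls are None. Sites missing in either input are skipped.
    """
    if len(array_calls) != len(imputed_calls):
        raise ValueError("genotype vectors must have the same length")

    compared = 0
    matches = 0
    for array_gt, imputed_gt in zip(array_calls, imputed_calls):
        if array_gt is None or imputed_gt is None:
            continue  # skip sites without a call on one of the platforms
        compared += 1
        if array_gt == imputed_gt:
            matches += 1

    if compared == 0:
        raise ValueError("no sites were called in both inputs")
    return matches / compared


# Toy example (made-up genotypes for one fish at eight SNPs).
array_gt = [0, 1, 2, 1, None, 0, 2, 1]
imputed_gt = [0, 1, 2, 2, 1, 0, 2, 1]
print(f"concordance = {genotype_concordance(array_gt, imputed_gt):.3f}")
```

In the toy data, seven sites are called on both platforms and six of them agree, so the function returns 6/7. In practice the genotypes would be read from the array export and the imputed VCF rather than typed in, and a dedicated concordance tool may define the metric differently (for example, per genotype class or as a squared correlation of dosages).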