Accurate genotype imputation from low-coverage whole-genome sequencing data of rainbow trout.

Affiliations

United States Department of Agriculture, National Center for Cool and Cold Water Aquaculture, Agricultural Research Service, Kearneysville, WV 25430, USA.

Troutlodge Inc., Sumner, WA 98390, USA.

Publication information

G3 (Bethesda). 2024 Sep 4;14(9). doi: 10.1093/g3journal/jkae168.

Abstract

With the rapid and significant cost reduction of next-generation sequencing, low-coverage whole-genome sequencing (lcWGS), followed by genotype imputation, is becoming a cost-effective alternative to single-nucleotide polymorphism (SNP)-array genotyping. The objectives of this study were 2-fold: (1) construct a haplotype reference panel for genotype imputation from lcWGS data in rainbow trout (Oncorhynchus mykiss); and (2) evaluate the concordance between imputed genotypes and SNP-array genotypes in 2 breeding populations. Medium-coverage (12×) whole-genome sequences were obtained from a total of 410 fish representing 5 breeding populations with various spawning dates. The short-read sequences were mapped to the rainbow trout reference genome, and genetic variants were identified using GATK. After data filtering, 20,434,612 biallelic SNPs were retained. The reference panel was phased with SHAPEIT5 and was used as a reference to impute genotypes from lcWGS data employing GLIMPSE2. A total of 90 fish from the Troutlodge November breeding population were sequenced with an average coverage of 1.3×, and these fish were also genotyped with the Axiom 57K rainbow trout SNP array. The concordance between array-based genotypes and imputed genotypes was 99.1%. After downsampling the coverage to 0.5×, 0.2×, and 0.1×, the concordance between array-based genotypes and imputed genotypes was 98.7, 97.8, and 96.7%, respectively. In the USDA odd-year breeding population, the concordance between array-based genotypes and imputed genotypes was 97.8% for 109 fish downsampled to 0.5× coverage. Therefore, the reference haplotype panel reported in this study can be used to accurately impute genotypes from lcWGS data in rainbow trout breeding populations.
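The concordance figures above can be illustrated with a minimal sketch. The abstract does not state the exact concordance definition or the downsampling tool, so this assumes concordance is the fraction of sites, among those called by both methods, where the imputed genotype equals the array genotype, and that downsampling keeps a fixed fraction of reads. The function names and the 0/1/2 alternate-allele-count encoding are hypothetical choices for illustration, not taken from the study.

```python
from typing import Optional, Sequence


def genotype_concordance(
    array_calls: Sequence[Optional[int]],
    imputed_calls: Sequence[Optional[int]],
) -> float:
    """Fraction of sites with matching genotypes (assumed definition).

    Genotypes are encoded as alternate-allele counts (0, 1, 2);
    None marks a missing call. Sites missing in either set are skipped.
    """
    if len(array_calls) != len(imputed_calls):
        raise ValueError("both call sets must cover the same sites")
    compared = 0
    matched = 0
    for array_gt, imputed_gt in zip(array_calls, imputed_calls):
        if array_gt is None or imputed_gt is None:
            continue  # no comparison possible at this site
        compared += 1
        if array_gt == imputed_gt:
            matched += 1
    if compared == 0:
        raise ValueError("no sites with calls in both sets")
    return matched / compared


def downsample_fraction(current_coverage: float, target_coverage: float) -> float:
    """Fraction of reads to keep to go from current to target mean coverage."""
    if not 0 < target_coverage <= current_coverage:
        raise ValueError("target coverage must be positive and not exceed current")
    return target_coverage / current_coverage


if __name__ == "__main__":
    # Toy data: four array sites, one missing on the array.
    array_gts = [0, 1, 2, None]
    imputed_gts = [0, 1, 1, 2]
    print(genotype_concordance(array_gts, imputed_gts))
    # Keeping this fraction of 1.3x reads gives roughly 0.5x coverage.
    print(downsample_fraction(1.3, 0.5))
```

Under these assumptions, going from the 1.3× data to 0.5× means keeping about 38% of the reads, and concordance is evaluated only at array sites where both genotype sets have a call.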
