Genomic prediction based on selective linkage disequilibrium pruning of low-coverage whole-genome sequence variants in a pure Duroc population.

Affiliations

State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Beijing, China.

National Research Facility for Phenotypic and Genotypic Analysis of Model Animals (Beijing), China Agricultural University, Beijing, China.

Publication information

Genet Sel Evol. 2023 Oct 18;55(1):72. doi: 10.1186/s12711-023-00843-w.

DOI:10.1186/s12711-023-00843-w

PMID:37853325

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10583454/

Abstract

BACKGROUND

Although the accumulation of whole-genome sequencing (WGS) data has accelerated the identification of mutations underlying complex traits, its impact on the accuracy of genomic predictions is limited. Reliable genotyping data and pre-selected beneficial loci can be used to improve prediction accuracy. Previously, we reported a low-coverage sequencing genotyping method that yielded 11.3 million highly accurate single-nucleotide polymorphisms (SNPs) in pigs. Here, we introduce a method termed selective linkage disequilibrium pruning (SLDP), which refines the set of SNPs that show a large gain during prediction of complex traits using whole-genome SNP data.

RESULTS

We used the SLDP method to identify and select markers among millions of SNPs based on genome-wide association study (GWAS) prior information. We evaluated the performance of SLDP with respect to three real traits and six simulated traits with varying genetic architectures using two representative models (genomic best linear unbiased prediction and BayesR) on samples from 3579 Duroc boars. SLDP was determined by testing 180 combinations of two core parameters (GWAS P-value thresholds and linkage disequilibrium r²). The parameters for each trait were optimized in the training population by fivefold cross-validation and then tested in the validation population. Similar to previous GWAS prior-based methods, the performance of SLDP was mainly affected by the genetic architecture of the traits analyzed. Specifically, SLDP performed better for traits controlled by major quantitative trait loci (QTL) or a small number of quantitative trait nucleotides (QTN). Compared with two commercial SNP chips, genotyping-by-sequencing data, and an unselected whole-genome SNP panel, the SLDP strategy led to significant improvements in prediction accuracy, which ranged from 0.84 to 3.22% for real traits controlled by major or moderate QTL and from 1.23 to 11.47% for simulated traits controlled by a small number of QTN.

CONCLUSIONS

The SLDP marker selection method can be incorporated into mainstream prediction models to yield accuracy improvements for traits with a relatively simple genetic architecture; however, it has no significant advantage for traits not controlled by major QTL. The main factors that affect its performance are the genetic architecture of traits and the reliability of GWAS prior information. Our findings can facilitate the application of WGS-based genomic selection.
