Optimal breeding-value prediction using a sparse selection index.

Affiliations

Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824, USA.

Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI 48824, USA.

Publication information

Genetics. 2021 May 17;218(1). doi: 10.1093/genetics/iyab030.

DOI: 10.1093/genetics/iyab030
PMID: 33748861
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8128408/

Abstract

Genomic prediction uses DNA sequences and phenotypes to predict genetic values. In homogeneous populations, theory indicates that the accuracy of genomic prediction increases with sample size. However, differences in allele frequencies and linkage disequilibrium patterns can lead to heterogeneity in SNP effects. In this context, calibrating genomic predictions using a large, potentially heterogeneous, training data set may not lead to optimal prediction accuracy. Some studies tried to address this sample size/homogeneity trade-off using training set optimization algorithms; however, this approach assumes that a single training data set is optimum for all individuals in the prediction set. Here, we propose an approach that identifies, for each individual in the prediction set, a subset from the training data (i.e., a set of support points) from which predictions are derived. The methodology that we propose is a sparse selection index (SSI) that integrates selection index methodology with sparsity-inducing techniques commonly used for high-dimensional regression. The sparsity of the resulting index is controlled by a regularization parameter (λ); the G-Best Linear Unbiased Predictor (G-BLUP) (the prediction method most commonly used in plant and animal breeding) appears as a special case which happens when λ = 0. In this study, we present the methodology and demonstrate (using two wheat data sets with phenotypes collected in 10 different environments) that the SSI can achieve significant (anywhere between 5 and 10%) gains in prediction accuracy relative to the G-BLUP.
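
The idea of a per-individual sparse index can be illustrated with a small numerical sketch. The abstract does not state the objective function, so the form used here is an assumption: for one prediction individual, the index weights b minimize 0.5*b'Pb - c'b + lam*||b||_1, where P is the training genomic relationship matrix plus a ridge term `theta` (an assumed residual-to-genetic variance ratio) and c holds the genomic relationships between the prediction individual and the training set. The function names, the toy marker data, and the coordinate-descent solver are all hypothetical choices made for illustration, not the authors' implementation. With lam = 0 the weights reduce to the usual G-BLUP weights, which matches the special case described in the abstract.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator used for the L1 penalty."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def sparse_index_weights(P, c, lam, max_iter=1000, tol=1e-10):
    """Minimize 0.5*b'Pb - c'b + lam*||b||_1 by cyclic coordinate descent.

    P must be symmetric positive definite. This objective is an assumed
    form of a sparse selection index, chosen for illustration only.
    """
    b = np.zeros(len(c), dtype=float)
    for _ in range(max_iter):
        max_change = 0.0
        for j in range(len(c)):
            # c_j minus the contribution of all other coordinates
            r_j = c[j] - P[j] @ b + P[j, j] * b[j]
            new = soft_threshold(r_j, lam) / P[j, j]
            max_change = max(max_change, abs(new - b[j]))
            b[j] = new
        if max_change < tol:
            break
    return b

# Toy data (simulated, not from the paper): 30 training lines plus one
# prediction individual, 200 standardized markers.
rng = np.random.default_rng(0)
n_trn, n_markers = 30, 200
X = rng.standard_normal((n_trn + 1, n_markers))
G = X @ X.T / n_markers                 # genomic relationship matrix
theta = 1.0                             # assumed variance ratio (h2 = 0.5)
P = G[:n_trn, :n_trn] + theta * np.eye(n_trn)
c = G[n_trn, :n_trn]                    # prediction individual vs. training set
y = rng.standard_normal(n_trn)          # toy centered phenotypes

b_gblup = sparse_index_weights(P, c, lam=0.0)    # lam = 0: dense weights
b_sparse = sparse_index_weights(P, c, lam=0.05)  # lam > 0: some weights are zero

print("support points (lam=0):   ", np.count_nonzero(b_gblup))
print("support points (lam=0.05):", np.count_nonzero(b_sparse))
print("predictions:", b_gblup @ y, b_sparse @ y)
```

In this sketch the training individuals with nonzero weight play the role of the support points, and the weights are solved separately for each prediction individual, so different individuals can end up with different subsets. How lam is chosen in practice is not covered here.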

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b5a/8128408/98214a9533bc/iyab030f1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b5a/8128408/4033afd05234/iyab030f2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b5a/8128408/385557a066be/iyab030f3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b5a/8128408/1e55663dc94f/iyab030f4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b5a/8128408/949cba379cbd/iyab030f5.jpg
