Cross-Validation Without Doing Cross-Validation in Genome-Enabled Prediction

Authors

Gianola Daniel, Schön Chris-Carolin

Affiliations

Department of Animal Sciences, University of Wisconsin-Madison, Wisconsin 53706; Department of Dairy Science, University of Wisconsin-Madison, Wisconsin 53706; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Wisconsin 53706; Department of Plant Sciences, Technical University of Munich School of Life Sciences, Technical University of Munich, Garching, Germany; Institute of Advanced Study, Technical University of Munich, Garching, Germany.

Department of Plant Sciences, Technical University of Munich School of Life Sciences, Technical University of Munich, Garching, Germany; Institute of Advanced Study, Technical University of Munich, Garching, Germany.

Publication information

G3 (Bethesda). 2016 Oct 13;6(10):3107-3128. doi: 10.1534/g3.116.033381.

DOI: 10.1534/g3.116.033381
PMID: 27489209
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5068934/
Abstract

Cross-validation of methods is an essential component of genome-enabled prediction of complex traits. We develop formulae for computing the predictions that would be obtained when one or several cases are removed in the training process, to become members of testing sets, but by running the model using all observations only once. Prediction methods to which the developments apply include least squares, best linear unbiased prediction (BLUP) of markers, or genomic BLUP, reproducing kernels Hilbert spaces regression with single or multiple kernel matrices, and any member of a suite of linear regression methods known as "Bayesian alphabet." The approach used for Bayesian models is based on importance sampling of posterior draws. Proof of concept is provided by applying the formulae to a wheat data set representing 599 inbred lines genotyped for 1279 markers, and the target trait was grain yield. The data set was used to evaluate predictive mean-squared error, impact of alternative layouts on maximum likelihood estimates of regularization parameters, model complexity, and residual degrees of freedom stemming from various strengths of regularization, as well as two forms of importance sampling. Our results will facilitate carrying out extensive cross-validation without model retraining for most machines employed in genome-assisted prediction of quantitative traits.
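
To make the central idea concrete, the following is a minimal sketch for one of the simplest cases named in the abstract, a ridge-type (BLUP of markers) regression with a fixed regularization parameter. It uses the standard hat-matrix identity for leave-one-out residuals, not the paper's formulae verbatim, and it is checked against brute-force refitting. The function names, the simulated data, and the value of `lam` are assumptions made for illustration only; they do not come from the article or from the wheat data set.

```python
import numpy as np


def ridge_loo_predictions(X, y, lam):
    """Leave-one-out predictions from a single fit on all observations.

    Assumes a ridge model without intercept and a fixed penalty lam > 0.
    Uses the identity e_loo_i = e_i / (1 - h_ii), where h_ii is the i-th
    diagonal element of the hat matrix H = X (X'X + lam I)^{-1} X'.
    """
    n, p = X.shape
    A = X.T @ X + lam * np.eye(p)
    H = X @ np.linalg.solve(A, X.T)      # n x n hat matrix
    y_hat = H @ y                        # fitted values using all data
    h = np.diag(H)                       # leverages
    loo_resid = (y - y_hat) / (1.0 - h)  # leave-one-out residuals
    return y - loo_resid                 # leave-one-out predictions


def ridge_loo_brute_force(X, y, lam):
    """Reference implementation: refit the model n times."""
    n, p = X.shape
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        beta = np.linalg.solve(Xi.T @ Xi + lam * np.eye(p), Xi.T @ yi)
        preds[i] = X[i] @ beta
    return preds


if __name__ == "__main__":
    # Simulated stand-in data (hypothetical), with more markers than lines.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((30, 50))
    y = rng.standard_normal(30)
    fast = ridge_loo_predictions(X, y, lam=5.0)
    slow = ridge_loo_brute_force(X, y, lam=5.0)
    print("max abs difference:", np.max(np.abs(fast - slow)))
    print("LOO predictive MSE:", np.mean((y - fast) ** 2))
```

The shortcut replaces n model fits with one, which is the kind of saving the abstract describes. The identity is exact only while the penalty is held fixed; if the regularization parameter were re-estimated in each fold, the shortcut would be an approximation.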
