Genomic selection using principal component regression.

Author affiliations

Department of Botany and Plant Sciences, University of California, Riverside, CA, 92521, USA.

College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, Jiangsu, China.

Publication information

Heredity (Edinb). 2018 Jul;121(1):12-23. doi: 10.1038/s41437-018-0078-x. Epub 2018 May 1.

Abstract

Many statistical methods are available for genomic selection (GS), in which genetic values of quantitative traits in plants and animals are predicted from whole-genome SNP data. Having far more predictors than subjects is a major computational challenge in GS. Principal components regression (PCR) and its derivative, partial least squares regression (PLSR), address this challenge through dimensionality reduction. In this study, we show that PCR can perform better than PLSR in cross validation. PCR often requires more components than PLSR to reach its maximum predictive ability and thus may carry a higher computational cost. However, applying the HAT method (a strategy that describes the relationship between the fitted and observed response variables with a hat matrix) to PCR circumvents conventional cross validation when testing predictive ability, which makes PCR substantially more computationally efficient than PLSR, where cross validation is mandatory. The advantages of PCR over PLSR are illustrated with a simulated trait in a hypothetical population and four agronomic traits in a rice population. The benefit of using PCR in genomic selection is further demonstrated by predicting 1000 metabolomic traits and 24,973 transcriptomic traits in the same rice population.
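
The hat-matrix idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the principal components are computed once from all subjects and then treated as fixed predictors in an ordinary least squares fit, so that leave-one-out residuals follow from the diagonal of the hat matrix without refitting. The function name `pcr_hat_loo`, the simulated genotypes, and the choice of `1 - PRESS / SS` as the predictability measure are assumptions made for illustration.

```python
import numpy as np


def pcr_hat_loo(X, y, k):
    """Leave-one-out predictions for PCR using hat-matrix leverages.

    Assumption: the k principal components are extracted once from the
    full marker matrix and held fixed; only the regression of y on the
    components is "refitted" (implicitly) for each left-out subject.
    """
    n = X.shape[0]
    if not 0 < k < n - 1:
        raise ValueError("k must satisfy 0 < k < n - 1")

    # Center the markers and take the leading left singular vectors,
    # which span the same space as the first k principal component scores.
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    Uk = U[:, :k]

    # Diagonal of the hat matrix for the model: intercept + k components.
    # The components are orthonormal and orthogonal to the intercept.
    leverage = 1.0 / n + np.sum(Uk ** 2, axis=1)

    # Ordinary fitted values and residuals from the full-data fit.
    y_bar = y.mean()
    fitted = y_bar + Uk @ (Uk.T @ (y - y_bar))
    resid = y - fitted

    # Leave-one-out residuals without refitting: e_i / (1 - h_ii).
    loo_resid = resid / (1.0 - leverage)
    press = float(np.sum(loo_resid ** 2))
    total_ss = float(np.sum((y - y_bar) ** 2))
    return y - loo_resid, press, 1.0 - press / total_ss


if __name__ == "__main__":
    # Simulated example: 60 subjects, 500 SNPs coded 0/1/2 (hypothetical data).
    rng = np.random.default_rng(1)
    X = rng.integers(0, 3, size=(60, 500)).astype(float)
    effects = np.zeros(500)
    effects[:10] = 0.5
    y = X @ effects + rng.normal(size=60)

    for k in (2, 5, 10, 20):
        _, press, predictability = pcr_hat_loo(X, y, k)
        print(k, round(press, 3), round(predictability, 3))
```

Because each number of components needs only one decomposition and one pass over the leverages, scanning many values of `k` is cheap under this assumption. A full cross validation that recomputes the components within every fold would be more expensive and could give somewhat different values.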
