Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations.

Authors

Rincent R, Charcosset A, Moreau L

Affiliations

INRA, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 5 chemin de Beaulieu, 63100, Clermont-Ferrand, France.

Université Blaise Pascal, UMR 1095 Génétique, Diversité et Ecophysiologie des Céréales, 63178, Aubière Cedex, France.

Publication information

Theor Appl Genet. 2017 Nov;130(11):2231-2247. doi: 10.1007/s00122-017-2956-7. Epub 2017 Aug 9.

Abstract

We propose a criterion to predict genomic selection efficiency for structured populations. This criterion is useful for defining an optimal calibration set and for estimating prediction reliability in multiparental populations. Genomic selection refers to the use of genotypic information to predict the performance of selection candidates. Prediction accuracy has been shown to depend on various parameters, including the composition of the calibration set (CS). Assessing the accuracy of a given prediction scenario is of the highest importance: it can be used to optimize CS sampling before phenotypes are collected, and once breeding values have been predicted it informs breeders about the reliability of those predictions. Different criteria have been proposed to optimize CS sampling in highly diverse panels, which can be useful for screening collections of genotypes. However, plant breeders often work on structured material such as biparental or multiparental populations, for which these criteria are less well suited. From the theory of the generalized coefficient of determination (CD), we derived different criteria to optimize CS sampling and to assess the reliability of predictions in structured populations. These criteria were evaluated on two nested association mapping (NAM) populations and two highly diverse panels of maize. They sampled optimized CS efficiently in most situations. They could also estimate, at least in part, the reliability of predictions between NAM families, but they could not estimate differences in the reliability of predictions for NAM families when the highly diverse panels were used as calibration sets. We illustrated that the CD criteria can be adapted to various prediction scenarios, including inter- and intra-family predictions, resulting in higher prediction accuracies.
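The abstract does not state the CD formula, so the following is only a minimal sketch of how a generalized CD of contrasts can be computed and averaged for a candidate calibration set. It assumes a GBLUP-type mixed model with an intercept as the only fixed effect, a known variance ratio `lam` (residual over genetic variance), and simulated marker data; the function name `cd_of_contrasts` and all variable names are hypothetical. It does not reproduce the criteria the authors derived for structured populations.

```python
import numpy as np

def cd_of_contrasts(G, calib_idx, contrasts, lam):
    """Generalized CD of each contrast (one per column of `contrasts`).

    Assumed model: y = 1*mu + Z u + e, u ~ N(0, G*s2g), e ~ N(0, I*s2e),
    lam = s2e / s2g. CD(c) = c'(G - lam*(Z'MZ + lam*G^-1)^-1)c / (c'Gc).
    """
    n_total = G.shape[0]
    n_cal = len(calib_idx)
    # Incidence matrix linking phenotyped (calibration) records to individuals.
    Z = np.zeros((n_cal, n_total))
    Z[np.arange(n_cal), calib_idx] = 1.0
    # Projector absorbing the intercept: M = I - 11'/n.
    M = np.eye(n_cal) - np.ones((n_cal, n_cal)) / n_cal
    S = Z.T @ M @ Z + lam * np.linalg.inv(G)
    explained = G - lam * np.linalg.inv(S)
    num = np.einsum("ij,ik,kj->j", contrasts, explained, contrasts)
    den = np.einsum("ij,ik,kj->j", contrasts, G, contrasts)
    return num / den

# Simulated example (illustrative data only, not from the paper).
rng = np.random.default_rng(0)
n_ind, n_markers = 30, 200
W = rng.integers(-1, 2, size=(n_ind, n_markers)).astype(float)
W -= W.mean(axis=0)                                  # center each marker
G = W @ W.T / n_markers + 0.01 * np.eye(n_ind)       # small ridge keeps G invertible

# Contrast of each individual against the mean of all individuals.
C = np.eye(n_ind) - 1.0 / n_ind
candidates = np.arange(15, n_ind)                    # individuals never phenotyped here

cd_small = cd_of_contrasts(G, np.arange(5), C[:, candidates], lam=1.0)
cd_large = cd_of_contrasts(G, np.arange(15), C[:, candidates], lam=1.0)
print("mean CD, 5 phenotyped: ", cd_small.mean())
print("mean CD, 15 phenotyped:", cd_large.mean())
```

Under these assumptions, a calibration set could be chosen by comparing the mean CD of the candidates across alternative sets and keeping the set with the highest value; the paper's own criteria adapt this idea to family structure in ways the abstract does not detail.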

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd1d/5641287/ac564c60ef8b/122_2017_2956_Fig1_HTML.jpg
