Seed Science Center, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 310058, China.
J Zhejiang Univ Sci B. 2013 Feb;14(2):162-70. doi: 10.1631/jzus.B1200075.
A cotton germplasm collection with data for 20 quantitative traits was used to investigate how the scale of quantitative trait data affects the representativeness of plant sub-core collections. The relationship between the representativeness of a sub-core collection and two influencing factors, the number of traits and the sampling percentage, was studied. A mixed linear model approach was used to eliminate environmental errors and predict the genotypic values of accessions. Sub-core collections were constructed using a least distance stepwise sampling (LDSS) method that combines standardized Euclidean distance with unweighted pair-group method with arithmetic means (UPGMA) clustering. The mean difference percentage (MD), variance difference percentage (VD), coincidence rate of range (CR), and variable rate of coefficient of variation (VR) served as evaluation parameters. Monte Carlo simulation was conducted to study the relationship among the number of traits, the sampling percentage, and the four evaluation parameters. The results showed that the representativeness of a sub-core collection was strongly affected by the number of traits and the sampling percentage, and that these two factors were closely connected. Increasing the number of traits improved the representativeness of a sub-core collection when genotypic values were used. As the sampling percentage varied, the genetic diversity of sub-core collections changed linearly when the number of traits was small and logarithmically when the number of traits was large. As the number of traits varied, however, the genetic diversity always followed a strong logarithmic trend, whatever the sampling percentage. A CR threshold method based on Monte Carlo simulation is proposed to determine a rational number of traits for a given sampling percentage of a sub-core collection.
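The stepwise sampling idea can be illustrated with a small sketch. It is a simplified stand-in, not the authors' implementation: it assumes that the lowest-level UPGMA subgroup is always the closest pair of accessions, so it skips building the full tree and repeatedly drops one member of the closest pair until the target sampling percentage is reached. The function names, the rule of dropping the second member of the pair (rather than sampling within the subgroup), and the choice to compute trait standard deviations once from the whole collection are all assumptions made for illustration.

```python
import math
from itertools import combinations
from statistics import stdev


def standardized_euclidean(a, b, sds):
    """Euclidean distance after dividing each trait difference by that trait's SD."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, sds)))


def ldss_sketch(values, sampling_fraction):
    """Simplified least-distance stepwise sampling (illustrative only).

    values: dict mapping accession name -> list of genotypic values, one per trait.
    sampling_fraction: fraction of accessions to keep, e.g. 0.5.
    Assumes every trait varies across accessions (non-zero SD).
    """
    names = list(values)
    n_traits = len(values[names[0]])
    # Assumption: SDs are computed once from the full collection.
    sds = [stdev([values[k][t] for k in names]) for t in range(n_traits)]
    target = max(1, round(len(names) * sampling_fraction))
    kept = list(names)
    while len(kept) > target:
        # The closest pair is the first cluster UPGMA would form.
        _, b = min(
            combinations(kept, 2),
            key=lambda p: standardized_euclidean(values[p[0]], values[p[1]], sds),
        )
        # Assumption: drop the second member of the pair deterministically.
        kept.remove(b)
    return kept


# Hypothetical data: six accessions, two traits, three obvious near-duplicates.
demo = {
    "A": [1.0, 10.0], "B": [1.1, 10.2],
    "C": [5.0, 20.0], "D": [5.2, 19.5],
    "E": [9.0, 30.0], "F": [9.1, 30.5],
}
print(ldss_sketch(demo, 0.5))
```

On the hypothetical data, one accession from each near-duplicate pair is removed, which is the redundancy-reducing behaviour the method is designed for.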
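Two of the four evaluation parameters named in the abstract, CR and VR, can be sketched in a few lines. The abstract does not give formulas, so the definitions below are assumptions based on how these parameters are commonly defined in the core-collection literature: CR averages, over traits, the ratio of the sub-core range to the full-collection range, and VR does the same for the coefficient of variation. MD and VD are omitted because they count traits whose means or variances differ significantly, which needs t-tests and F-tests. The function names and data are hypothetical.

```python
from statistics import mean, stdev


def coincidence_rate(core, full):
    """CR (%): mean over traits of range(core) / range(full), times 100.

    core, full: lists of per-trait value lists, traits in the same order.
    """
    ratios = [(max(c) - min(c)) / (max(f) - min(f)) for c, f in zip(core, full)]
    return 100 * mean(ratios)


def variable_rate(core, full):
    """VR (%): mean over traits of CV(core) / CV(full), times 100.

    Assumes trait means are non-zero, so the coefficient of variation is defined.
    """
    ratios = [
        (stdev(c) / mean(c)) / (stdev(f) / mean(f)) for c, f in zip(core, full)
    ]
    return 100 * mean(ratios)


# Hypothetical data: two traits, five accessions in the full collection,
# three of them retained in the sub-core collection.
full = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]]
core = [[1, 3, 5], [10, 30, 40]]
print(coincidence_rate(core, full))
print(variable_rate(core, full))
```

In this example the sub-core keeps the full range of the first trait but only three quarters of the range of the second, so CR falls below 100%. A CR threshold method would compare such a value against a chosen cutoff.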