Chen Tai-Shen, Aoike Toru, Yamasaki Masanori, Kajiya-Kanegae Hiromi, Iwata Hiroyoshi
Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo, Japan.
Food Resources Education and Research Center, Graduate School of Agricultural Science, Kobe University, Kasai, Hyogo, Japan.
Front Genet. 2020 Dec 18;11:599510. doi: 10.3389/fgene.2020.599510. eCollection 2020.
Accurate prediction of heading date under various environmental conditions is expected to facilitate decision-making in cultivation management and the breeding of new cultivars adapted to the environment. Days to heading (DTH) is a complex trait known to be controlled by multiple genes and genotype-by-environment interactions. Crop growth models (CGMs) have been widely used to predict the phenological development of a plant in an environment; however, they usually require substantial experimental data to calibrate the parameters of the model. The parameters are mostly genotype-specific and are thus usually estimated separately for each cultivar. We propose an integrated approach that uses a machine learning model to link marker genotype data with the genotype-specific developmental parameters of CGMs, and thereby allows heading date prediction for a new genotype in a new environment. To estimate the parameters, we implemented a Bayesian approach with an advanced Markov chain Monte Carlo algorithm, the differential evolution adaptive Metropolis, and conducted the estimation using a large amount of data on heading date and environmental variables. The data comprised sowing and heading dates of 112 cultivars/lines tested at 7 locations for 14 years and the corresponding environmental variables (day length and daily temperature). We compared the predictive accuracy for DTH among the proposed approach, a CGM, and a single machine learning model. The results showed that the extreme learning machine (one of the implemented machine learning models) was superior to the CGM for the prediction of a tested genotype in a tested location. The proposed approach outperformed the machine learning method in the prediction of an untested genotype in an untested location.
We also evaluated the potential of the proposed approach in the prediction of the distribution of DTH in 103 F2 segregation populations derived from crosses between a common parent, Koshihikari, and 103 cultivars/lines. The results showed a high correlation coefficient (ca. 0.8) between the 10th, 50th, and 90th percentiles of the observed and predicted distributions of DTH. In this study, the integration of a machine learning model and a CGM was better able to predict the heading date of a new rice cultivar in an untested potential environment.
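The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the development-rate function, the parameter names (`base_temp`, `critical_daylength`, `photo_sensitivity`, `required_units`), and the linear marker-to-parameter map are all hypothetical stand-ins chosen for illustration. In particular, a hand-set linear predictor takes the place of the machine learning model, and no Bayesian parameter estimation is performed.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class GenotypeParams:
    """Hypothetical genotype-specific parameters of a toy phenology model."""
    base_temp: float           # deg C; no development below this temperature
    critical_daylength: float  # hours; longer days slow development
    photo_sensitivity: float   # fractional slowdown per hour above critical
    required_units: float      # accumulated units needed to reach heading


def daily_rate(temp: float, daylength: float, p: GenotypeParams) -> float:
    """Development units gained in one day (assumed form, not the paper's CGM)."""
    thermal = max(0.0, temp - p.base_temp)
    excess = max(0.0, daylength - p.critical_daylength)
    photo = max(0.0, 1.0 - p.photo_sensitivity * excess)
    return thermal * photo


def predict_dth(temps: Sequence[float], daylengths: Sequence[float],
                p: GenotypeParams) -> Optional[int]:
    """Return days to heading, or None if heading is not reached."""
    total = 0.0
    for day, (t, d) in enumerate(zip(temps, daylengths), start=1):
        total += daily_rate(t, d, p)
        if total >= p.required_units:
            return day
    return None


def params_from_markers(markers: Sequence[int], effects: Sequence[float],
                        intercept: float) -> GenotypeParams:
    """Map marker scores to one model parameter with a linear predictor.

    This stands in for the machine learning step; the other parameters
    are fixed at arbitrary illustrative values.
    """
    required = intercept + sum(m * e for m, e in zip(markers, effects))
    return GenotypeParams(base_temp=10.0, critical_daylength=13.0,
                          photo_sensitivity=0.2, required_units=required)


# Usage: an "untested" genotype in two hypothetical environments.
new_genotype = params_from_markers([1, 0, 1], [10.0, 5.0, -5.0], 140.0)
short_days = predict_dth([25.0] * 60, [12.0] * 60, new_genotype)
long_days = predict_dth([25.0] * 60, [14.0] * 60, new_genotype)
```

Because the environment enters only through the daily temperature and day-length series, and the genotype only through the parameters, a genotype with marker data but no field records can in principle be simulated in any environment for which weather data exist. That is the property the integrated approach relies on.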