Department of Agronomy, Iowa State University, Ames, IA, USA.
Department of Mechanical Engineering, Iowa State University, Ames, IA, USA.
Sci Rep. 2019 Nov 20;9(1):17132. doi: 10.1038/s41598-019-53451-4.
We explored the capability of fusing high-dimensional phenotypic trait (phenomic) data with a machine learning (ML) approach to give plant breeders tools for both in-season seed yield (SY) prediction and prescriptive cultivar development for targeted agro-management practices (e.g., row spacing and seeding density). We phenotyped 32 SoyNAM parent genotypes in two independent studies, each with contrasting agro-management treatments (two row spacings, three seeding densities). Phenotypic trait data (canopy temperature, chlorophyll content, hyperspectral reflectance, leaf area index, and light interception) were generated using an array of sensors at three growth stages during the growing season, and SY was determined by machine harvest. Random forest (RF) was used to train models for SY prediction from phenotypic traits (predictor variables) and to identify the optimal temporal combination of variables to maximize accuracy and resource allocation. RF models were trained using data from both experiments and individually for each agro-management treatment. We report the most important traits agnostic of agro-management practices. Several predictor variables showed conditional importance dependent on the agro-management system. We assembled predictive models to enable in-season SY prediction, providing a framework to integrate phenomics information with powerful ML for prediction-enabled prescriptive plant breeding.
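The RF workflow described above (train on trait-by-stage predictors, rank variable importance, then repeat per agro-management treatment) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn and NumPy, the feature names, treatment labels, and plot count are hypothetical, the data are synthetic, and impurity-based importance is used here only as one common choice of importance measure.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical trait and growth-stage labels (not the study's variable names).
TRAITS = ["canopy_temp", "chlorophyll", "lai", "light_interception"]
STAGES = ["stage1", "stage2", "stage3"]
FEATURES = [f"{trait}_{stage}" for stage in STAGES for trait in TRAITS]


def make_synthetic_plots(n_plots=192, seed=0):
    """Generate toy plot-level data; none of this comes from the paper."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_plots, len(FEATURES)))
    # Assumed toy signal: yield depends mainly on two trait-by-stage features.
    y = (
        3.0 * X[:, FEATURES.index("lai_stage2")]
        - 1.5 * X[:, FEATURES.index("canopy_temp_stage3")]
        + rng.normal(scale=0.5, size=n_plots)
    )
    # Hypothetical treatment label per plot (e.g. two row spacings).
    treatment = rng.choice(["narrow_row", "wide_row"], size=n_plots)
    return X, y, treatment


def fit_and_rank(X, y, seed=0):
    """Fit an RF regressor; return (out-of-bag R^2, features ranked by importance)."""
    model = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=seed)
    model.fit(X, y)
    order = np.argsort(model.feature_importances_)[::-1]
    ranking = [(FEATURES[i], float(model.feature_importances_[i])) for i in order]
    return model.oob_score_, ranking


def rank_per_treatment(X, y, treatment):
    """Refit separately within each treatment to expose conditional importance."""
    results = {}
    for label in np.unique(treatment):
        mask = treatment == label
        _, ranking = fit_and_rank(X[mask], y[mask])
        results[str(label)] = ranking
    return results


X, y, treatment = make_synthetic_plots()
oob_r2, pooled_ranking = fit_and_rank(X, y)        # model pooled across treatments
per_treatment = rank_per_treatment(X, y, treatment)  # one model per treatment

print(f"Pooled out-of-bag R^2: {oob_r2:.2f}")
print("Top pooled predictors:", [name for name, _ in pooled_ranking[:3]])
for label, ranking in per_treatment.items():
    print(label, "top predictor:", ranking[0][0])
```

Comparing the pooled ranking with the per-treatment rankings is one simple way to see which predictors matter regardless of management and which matter only under a given treatment. With real data, dropping all features from a later growth stage and refitting would indicate how early in the season a usable prediction is available.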