Pugh N Ace, Young Andrew, Ojha Manisha, Emendack Yves, Sanchez Jacobo, Xin Zhanguo, Puppala Naveen
United States Department of Agriculture, Crop Stress Research Laboratory, Lubbock, TX, United States.
Agricultural Science Center at Clovis, New Mexico State University, Clovis, NM, United States.
Front Plant Sci. 2024 Feb 20;15:1339864. doi: 10.3389/fpls.2024.1339864. eCollection 2024.
Peanut is a critical food crop worldwide, and the development of high-throughput phenotyping techniques is essential for enhancing the crop's genetic gain rate. Given the obvious challenges of directly estimating peanut yields through remote sensing, an approach that utilizes above-ground phenotypes to estimate underground yield is necessary. To that end, this study leveraged unmanned aerial vehicles (UAVs) for high-throughput phenotyping of surface traits in peanut. Using a diverse set of peanut germplasm planted in 2021 and 2022, UAV flight missions were repeatedly conducted to capture image data that were used to construct high-resolution multitemporal sigmoidal growth curves based on apparent characteristics, such as canopy cover and canopy height. Latent phenotypes extracted from these growth curves and their first derivatives informed the development of advanced machine learning models, specifically random forest and eXtreme Gradient Boosting (XGBoost), to estimate yield in the peanut plots. The random forest model exhibited exceptional predictive accuracy (R = 0.93), while XGBoost was also reasonably effective (R = 0.88). When using confusion matrices to evaluate the classification abilities of each model, the two models proved valuable in a breeding pipeline, particularly for filtering out underperforming genotypes. In addition, the random forest model excelled in identifying top-performing material while minimizing Type I and Type II errors. Overall, these findings underscore the potential of machine learning models, especially random forests and XGBoost, in predicting peanut yield and improving the efficiency of peanut breeding programs.
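The growth-curve step described above can be illustrated with a minimal sketch. It assumes a three-parameter logistic curve and a simple logit-linearization fit; the abstract does not specify the authors' curve form or fitting procedure, so the function names, the `headroom` parameter, and the synthetic canopy-cover values below are all hypothetical. The extracted parameters (asymptote, rate, inflection day, and the peak of the first derivative) are examples of the kind of latent phenotypes that could be passed to a random forest or XGBoost model as features.

```python
import math


def logistic(t, K, r, t0):
    """Three-parameter logistic (sigmoidal) curve: asymptote K, rate r, inflection t0."""
    return K / (1.0 + math.exp(-r * (t - t0)))


def logistic_rate(t, K, r, t0):
    """First derivative of the logistic curve with respect to time."""
    y = logistic(t, K, r, t0)
    return r * y * (1.0 - y / K)


def fit_logistic(days, values, K=None, headroom=1.05):
    """Fit r and t0 by ordinary least squares on the logit-transformed data.

    Assumption: if K is not supplied, it is approximated as slightly above the
    largest observation (headroom * max). All values must lie in (0, K).
    """
    if K is None:
        K = headroom * max(values)
    # Linearize: ln(y / (K - y)) = r * t - r * t0
    z = [math.log(v / (K - v)) for v in values]
    n = len(days)
    mean_t = sum(days) / n
    mean_z = sum(z) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxz = sum((t - mean_t) * (zi - mean_z) for t, zi in zip(days, z))
    r = sxz / sxx
    intercept = mean_z - r * mean_t
    t0 = -intercept / r
    return K, r, t0


def latent_phenotypes(K, r, t0):
    """Summarize a fitted curve as per-plot features for a downstream model."""
    return {
        "asymptote": K,          # final canopy cover or height
        "growth_rate": r,        # steepness of the curve
        "inflection_day": t0,    # day of fastest growth
        "max_rate": K * r / 4.0  # peak of the first derivative, reached at t0
    }


# Synthetic example (not data from the study): canopy cover fraction per flight day.
flight_days = [20, 30, 40, 50, 60, 70, 80]
canopy_cover = [logistic(d, 0.9, 0.12, 45.0) for d in flight_days]

K, r, t0 = fit_logistic(flight_days, canopy_cover)
features = latent_phenotypes(K, r, t0)
```

The linearized fit keeps the sketch dependency-free; a nonlinear least-squares routine would usually be preferable for noisy field data because it estimates the asymptote jointly with the other parameters.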