Kierdorf Jana, Stomberg Timo Tjarden, Drees Lukas, Rascher Uwe, Roscher Ribana
Remote Sensing Group, Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany.
Institute of Bio- and Geosciences, IBG-2: Plant Sciences, Forschungszentrum Jülich GmbH, Jülich, Germany.
Front Artif Intell. 2024 Sep 18;7:1416323. doi: 10.3389/frai.2024.1416323. eCollection 2024.
Cauliflower must meet strict quality-control criteria at sale, which makes accurate harvest timing important. Time series data for plant phenotyping can provide insight into the dynamic development of cauliflower and allow more accurate predictions of harvest-readiness than single-time observations. However, acquiring data daily or weekly is resource-intensive, so the choice of acquisition days matters. We investigate which data acquisition days and development stages improve model accuracy, both to identify prediction-relevant observation days and to aid future data acquisition planning. We analyze harvest-readiness using the cauliflower image time series of the GrowliFlower dataset. We use an adjusted ResNet18 classification model that includes a positional encoding of the data acquisition dates to add implicit information about development. The explainable machine learning approach GroupSHAP is used to analyze the contribution of each time point. Time points with the lowest mean absolute contribution are excluded from the time series to determine their effect on model accuracy. Using image time series rather than single time points increases accuracy by 4%. GroupSHAP allows the selection of time points that positively affect model accuracy. Using seven selected time points instead of all 11 improves accuracy by an additional 4%, resulting in an overall accuracy of 89.3%. Selecting time points in this way may therefore reduce the amount of data that needs to be collected in the future.
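The selection step described in the abstract (attribute the prediction to whole time points, then drop those with the lowest mean absolute contribution) can be illustrated with a minimal sketch. This is not the authors' implementation: it computes exact Shapley values over groups by enumerating all subsets, it masks an excluded time point by zeroing it, and it uses a hypothetical linear scorer (`weights`, `make_value_fn`) as a stand-in for the trained ResNet18. All names and the toy data are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial

import numpy as np


def group_shapley(value_fn, n_groups):
    """Exact Shapley value of each group (here: one time point) for one sample.

    value_fn takes a boolean mask of length n_groups (True = time point kept)
    and returns the model output for the correspondingly masked input.
    """
    phi = np.zeros(n_groups)
    for g in range(n_groups):
        others = [i for i in range(n_groups) if i != g]
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude group g
            weight = factorial(k) * factorial(n_groups - k - 1) / factorial(n_groups)
            for subset in combinations(others, k):
                mask = np.array([i in subset for i in range(n_groups)])
                without_g = value_fn(mask)
                mask[g] = True
                with_g = value_fn(mask)
                phi[g] += weight * (with_g - without_g)
    return phi


def rank_time_points(phi_per_sample):
    """Order time points by mean absolute contribution, least important first."""
    mean_abs = np.abs(phi_per_sample).mean(axis=0)
    return np.argsort(mean_abs)


# Hypothetical stand-in for the trained classifier: a linear score over four
# time points, where a masked time point contributes zero (assumed baseline).
weights = np.array([0.0, 2.0, -1.0, 0.5])


def make_value_fn(x):
    return lambda mask: float(np.dot(weights * mask, x))


# Toy "time series": two samples with one scalar feature per time point.
X = np.array([[1.0, 1.0, 1.0, 1.0],
              [2.0, 0.5, 1.0, 1.0]])

phi = np.array([group_shapley(make_value_fn(x), len(weights)) for x in X])
order = rank_time_points(phi)  # candidates for exclusion come first
```

In this toy setting the time point with zero weight receives zero contribution and is ranked first for exclusion. Exact enumeration needs on the order of 2^n model evaluations per time point, which is still manageable for the 11 time points mentioned above but would call for a sampling-based approximation with longer series.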