Cowlishaw Robert, Longépé Nicolas, Riccardi Annalisa
Mechanical and Aerospace Engineering, University of Strathclyde, Glasgow, G1 1XQ, UK.
Φ-lab, European Space Agency, 00044, Frascati, Italy.
Sci Rep. 2025 Mar 26;15(1):10454. doi: 10.1038/s41598-025-94244-2.
Crop yield prediction using Earth Observation data is challenging because the data modalities are diverse and relevant datasets are scarce, often being proprietary or private. Decentralised federated learning has been proposed to address these privacy concerns, since no data labels need to be distributed to a third party. However, the performance of federated learning is significantly influenced by the number of clients and the distribution of data among them. This study investigates the impact of aggregation levels on federated learning using a proxy model trained on crop type data derived from Copernicus Sentinel-2 images. The interaction of these aggregation levels with other parameters is simulated and studied so that the results can be generalised to different situations. The analysis also examines the current and future distributions of crop yield datasets to determine the optimal aggregation levels for effective federated learning. The findings highlight that dataset size directly affects both the learning outcomes and the degree of privacy that can be maintained. Other scenarios and the implications of these results are discussed for a future decentralised federated learning architecture for crop yield prediction.
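To make the trade-off between aggregation level, client count, and per-client dataset size concrete, the following is a minimal sketch and not the authors' implementation. It assumes one possible reading of "aggregation level": records carry a hierarchical location code, and a client is formed from all records that share the first `level` parts of that code. The record layout, the function names `group_by_level` and `fedavg`, and the size-weighted averaging rule (in the style of FedAvg) are all illustrative assumptions; the abstract does not specify the model-combination rule.

```python
from collections import defaultdict


def group_by_level(records, level):
    """Group records into clients by the first `level` parts of their location code.

    Hypothetical layout: each record is (location_code_tuple, sample).
    A lower level merges more records into fewer, larger clients.
    """
    clients = defaultdict(list)
    for location_code, sample in records:
        clients[location_code[:level]].append(sample)
    return dict(clients)


def fedavg(client_weights, client_sizes):
    """Size-weighted average of per-client parameter vectors (FedAvg-style).

    Assumption: this rule is used only for illustration; the source does not
    state which aggregation rule the study uses.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical records: (country, region, farm) code plus a placeholder sample.
records = [
    (("UK", "Scotland", "farm1"), 1.0),
    (("UK", "Scotland", "farm2"), 2.0),
    (("UK", "Wales", "farm3"), 3.0),
    (("IT", "Lazio", "farm4"), 4.0),
]

# Finer aggregation levels give more clients, each holding less data.
for level in (1, 2, 3):
    clients = group_by_level(records, level)
    sizes = [len(samples) for samples in clients.values()]
    print(f"level={level} clients={len(clients)} sizes={sizes}")
```

In this toy setting, a coarse level yields few clients with larger datasets, while a fine level yields many clients with very little data each, which is the tension between learning performance and privacy that the abstract describes.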