Singh Geethen, Moncrieff Glenn, Venter Zander, Cawse-Nicholson Kerry, Slingsby Jasper, Robinson Tamara B
Department of Botany and Zoology, Centre for Invasion Biology, Stellenbosch University, Stellenbosch, South Africa.
Global Science, The Nature Conservancy, Cape Town, 7945, South Africa.
Sci Rep. 2024 Jul 13;14(1):16166. doi: 10.1038/s41598-024-65954-w.
Machine learning is increasingly applied to Earth Observation (EO) data to obtain datasets that contribute towards international accords. However, these datasets contain inherent uncertainty that needs to be quantified reliably to avoid negative consequences. In response to the increased need to report uncertainty, we bring attention to the promise of conformal prediction within the domain of EO. Unlike previous uncertainty quantification methods, conformal prediction offers statistically valid prediction regions while supporting any machine learning model and data distribution. To substantiate the need for conformal prediction, we reviewed EO datasets and found that only 22.5% of them incorporated a degree of uncertainty information, with unreliable methods prevalent. Current open implementations require moving large amounts of EO data to the algorithms. We introduce Google Earth Engine native modules that bring conformal prediction to the data and compute, facilitating the integration of uncertainty quantification into existing traditional and deep learning modelling workflows. To demonstrate the versatility and scalability of these tools, we apply them to valued EO applications spanning local to global extents and both regression and classification tasks. We then discuss the opportunities arising from the use of conformal prediction in EO. We anticipate that accessible and easy-to-use tools, such as those provided here, will drive wider adoption of rigorous uncertainty quantification in EO, thereby enhancing the reliability of downstream uses such as operational monitoring and decision-making.
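To make the idea of a model-agnostic, statistically valid prediction region concrete, the following is a minimal sketch of split (inductive) conformal prediction for regression. It is not the Google Earth Engine modules described above: it uses plain NumPy, the function names `split_conformal_interval` and `demo_coverage` are hypothetical, and the synthetic data and the fixed "model" (prediction = 2x) are assumptions chosen only for illustration. The sketch assumes calibration and test points are exchangeable, which is the condition under which the coverage guarantee holds.

```python
import numpy as np


def split_conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    """Split conformal intervals with about (1 - alpha) marginal coverage.

    cal_pred / cal_true: model predictions and reference values on a held-out
    calibration set. test_pred: predictions to wrap in intervals.
    """
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_true = np.asarray(cal_true, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)

    # Nonconformity score: absolute residual on the calibration set.
    scores = np.abs(cal_true - cal_pred)
    n = scores.size

    # Finite-sample corrected rank of the score used as the interval half-width.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for the requested level: the interval is unbounded.
        q = np.inf
    else:
        q = np.sort(scores)[k - 1]

    return test_pred - q, test_pred + q


def demo_coverage(seed=0, n_cal=1000, n_test=5000, alpha=0.1):
    """Empirical coverage on synthetic data (illustrative assumption, not EO data)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, size=n_cal + n_test)
    y = 2.0 * x + rng.normal(0.0, 1.0, size=x.size)

    # Stand-in for any fitted model; conformal prediction treats it as a black box.
    pred = 2.0 * x

    lower, upper = split_conformal_interval(
        pred[:n_cal], y[:n_cal], pred[n_cal:], alpha=alpha
    )
    y_test = y[n_cal:]
    return float(np.mean((y_test >= lower) & (y_test <= upper)))


if __name__ == "__main__":
    print(f"empirical coverage: {demo_coverage():.3f}")
```

With `alpha=0.1`, the empirical coverage on the test split should land close to 90%, whatever model produced the predictions. The interval width here is constant across inputs; adaptive variants (for example, conformalized quantile regression) change the score function but keep the same calibration step.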