A Survey of Active Learning for Quantifying Vegetation Traits from Terrestrial Earth Observation Data.

Author information

Berger Katja, Caicedo Juan Pablo Rivera, Martino Luca, Wocher Matthias, Hank Tobias, Verrelst Jochem

Affiliations

Department of Geography, Ludwig-Maximilians-Universität München (LMU), Luisenstr. 37, 80333 Munich, Germany.

Secretary of Research and Graduate Studies, CONACYT-UAN, 63155 Tepic, Nayarit, Mexico.

Publication information

Remote Sens (Basel). 2021 Jan 15;13(2):287. doi: 10.3390/rs13020287.

Abstract

The current exponential increase of spatiotemporally explicit data streams from satellite-based Earth observation missions offers promising opportunities for global vegetation monitoring. Intelligent sampling through active learning (AL) heuristics provides a pathway for fast inference of essential vegetation variables by means of hybrid retrieval approaches, i.e., machine learning regression algorithms trained on radiative transfer model (RTM) simulations. In this study, we summarize AL theory and perform a brief systematic literature survey of AL heuristics used in Earth observation regression problems over terrestrial targets. Across all relevant studies, it appeared that: (i) models trained on AL-optimized training data sets achieved higher retrieval accuracy than models trained on large randomly sampled data sets, and (ii) the Euclidean distance-based diversity (EBD) method tends to be the most efficient AL technique in terms of accuracy and computational demand. Additionally, a case study based on experimental data is presented that employs both uncertainty and diversity AL criteria. Here, a training database simulated with the PROSAIL-PRO canopy RTM is used to demonstrate the benefit of AL techniques for estimating total leaf carotenoid content and leaf water content. Gaussian process regression (GPR) was combined with AL to reduce and optimize the training data set. Training the GPR algorithm on optimally AL-sampled data sets led to improved variable retrievals compared to training on full data pools, which is further demonstrated in a mapping example. Based on these findings, we recommend AL-based sub-sampling procedures for selecting the most informative samples from large training data pools. This not only optimizes regression accuracy by excluding redundant information, but also speeds up processing and reduces the final model size of kernel-based machine learning regression algorithms such as GPR. With this study, we want to encourage further testing and implementation of AL sampling methods for hybrid retrieval workflows. AL can contribute to solving regression problems in operational vegetation monitoring with satellite imaging spectroscopy data, and may strongly facilitate data processing on cloud-computing platforms.
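To make the diversity criterion named in the abstract concrete, the following is a minimal sketch of Euclidean distance-based selection, assuming the criterion scores each candidate in the pool by its distance to the nearest sample already in the training set and picks the most distant ones. The function names `ebd_select` and `grow_training_set`, the batch size, and the random toy data are illustrative assumptions, not the authors' implementation. A real hybrid retrieval workflow would also carry the simulated vegetation variables along with the spectra and retrain the regression model (e.g., GPR) after each batch.

```python
import numpy as np


def ebd_select(X_train, X_pool, n_select):
    """Return indices of the pool samples farthest from the training set.

    Each pool sample is scored by its Euclidean distance to the nearest
    training sample; the n_select highest-scoring samples are returned,
    most distant first. (Hypothetical helper, for illustration only.)
    """
    # Pairwise differences, shape (n_pool, n_train, n_features)
    diff = X_pool[:, None, :] - X_train[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Distance from each pool sample to its nearest training sample
    nearest = dist.min(axis=1)
    return np.argsort(nearest)[::-1][:n_select]


def grow_training_set(X_train, X_pool, n_iter, batch_size):
    """Iteratively move the most diverse pool samples into the training set."""
    for _ in range(n_iter):
        if len(X_pool) == 0:
            break
        idx = ebd_select(X_train, X_pool, batch_size)
        X_train = np.vstack([X_train, X_pool[idx]])
        X_pool = np.delete(X_pool, idx, axis=0)
    return X_train, X_pool


if __name__ == "__main__":
    # Toy stand-in for simulated spectra: 50 pool samples, 5 initial samples
    rng = np.random.default_rng(0)
    pool = rng.normal(size=(50, 3))
    start = rng.normal(size=(5, 3))
    train, remaining = grow_training_set(start, pool, n_iter=3, batch_size=4)
    print(train.shape, remaining.shape)
```

Because the score depends only on the input features, this criterion needs no trained model, which is consistent with the low computational demand the abstract attributes to EBD. Uncertainty criteria, by contrast, require model predictions at each iteration.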


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e83e/7613397/942666675e37/EMS152662-f001.jpg
