On-the-Fly Fusion of Remotely-Sensed Big Data Using an Elastic Computing Paradigm with a Containerized Spark Engine on Kubernetes.

Authors

Huang Wei, Zhou Jianzhong, Zhang Dongying

Affiliation

School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan 430074, China.

Publication

Sensors (Basel). 2021 Apr 23;21(9):2971. doi: 10.3390/s21092971.

DOI: 10.3390/s21092971
PMID: 33922709
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8122984/

Abstract

Remotely-sensed satellite image fusion is indispensable for the generation of long-term gap-free Earth observation data. While cloud computing (CC) provides the big picture for remote sensing (RS) big data (RSBD), the fundamental question of how to fuse RSBD efficiently on CC platforms has not yet been settled. To this end, we propose a lightweight cloud-native framework for the elastic processing of RSBD. With the scaling mechanisms provided by both the Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) layers of CC, the Spark-on-Kubernetes operator model running in the framework can enhance the efficiency of Spark-based algorithms without having to account for bottlenecks such as task latency caused by an unbalanced workload, and can ease the burden of tuning the performance parameters of parallel algorithms. Internally, we propose a task scheduling mechanism (TSM) that dynamically changes the Spark executor pods' affinities to the computing hosts. The TSM learns the workload of each computing host from the ratio between the numbers of completed and failed tasks on it, and dispatches Spark executor pods to newer and less-overwhelmed hosts. To illustrate the advantage, we implement a parallel enhanced spatial and temporal adaptive reflectance fusion model (PESTARFM) that enables the efficient fusion of big RS images with a Spark aggregation function. We construct an OpenStack cloud computing environment to test the usability of the framework. According to the experiments, the TSM improves the performance of the PESTARFM by about 11.7% when only PaaS scaling is used. When both IaaS and PaaS scaling are used, the maximum performance gain with the TSM can exceed 13.6%. The fusion of the big Sentinel and PlanetScope images used in the experiments takes less than 4 min in the experimental environment.
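
The abstract does not spell out how the TSM turns task counts into pod affinities, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical `HostStats` record per computing host, an add-one smoothed success score (so that a newly added host with no history gets a neutral value), and a mapping of that score onto the 1 to 100 weight range of a Kubernetes `preferredDuringSchedulingIgnoredDuringExecution` node affinity. The host names and the scoring formula are illustrative assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class HostStats:
    """Task counters observed on one computing host (hypothetical record)."""
    name: str
    completed: int
    failed: int


def success_score(stats: HostStats) -> float:
    """Smoothed share of completed tasks, strictly between 0 and 1.

    Add-one smoothing is an assumption of this sketch: a host with no
    task history scores a neutral 0.5 instead of dividing by zero.
    """
    return (stats.completed + 1) / (stats.completed + stats.failed + 2)


def preferred_node_affinity(hosts: list[HostStats]) -> dict:
    """Build a nodeAffinity fragment that prefers hosts with fewer failures."""
    terms = []
    for stats in sorted(hosts, key=success_score, reverse=True):
        # Kubernetes accepts preference weights in the range 1..100.
        weight = max(1, round(100 * success_score(stats)))
        terms.append({
            "weight": weight,
            "preference": {
                "matchExpressions": [{
                    "key": "kubernetes.io/hostname",
                    "operator": "In",
                    "values": [stats.name],
                }]
            },
        })
    return {
        "nodeAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": terms
        }
    }


if __name__ == "__main__":
    observed = [
        HostStats("node-a", completed=90, failed=10),
        HostStats("node-b", completed=40, failed=60),
        HostStats("node-c", completed=0, failed=0),  # freshly added host
    ]
    affinity = preferred_node_affinity(observed)
    for term in affinity["nodeAffinity"][
        "preferredDuringSchedulingIgnoredDuringExecution"
    ]:
        host = term["preference"]["matchExpressions"][0]["values"][0]
        print(host, term["weight"])
```

A fragment like this could be placed under the `affinity` field of an executor pod specification and regenerated as the counters change. How often the TSM in the paper refreshes affinities, and how it treats newly provisioned hosts, is not stated in the abstract.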


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3f5/8122984/a01b99c13a87/sensors-21-02971-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3f5/8122984/eadde5ed043d/sensors-21-02971-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3f5/8122984/fffe4399a34c/sensors-21-02971-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3f5/8122984/66ba6d3b1327/sensors-21-02971-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3f5/8122984/da13877d71bf/sensors-21-02971-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3f5/8122984/7a17484d0e26/sensors-21-02971-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3f5/8122984/470f1c43528c/sensors-21-02971-g007.jpg

