Design and Evaluation of Real-Time Data Storage and Signal Processing in a Long-Range Distributed Acoustic Sensing (DAS) Using Cloud-Based Services.

Author information

Nur Abdusomad, Muanenda Yonas

Affiliations

Addis Ababa Institute of Technology, Addis Ababa University, King George VI St, Addis Ababa 1000, Ethiopia.

Institute of Mechanical Intelligence, Scuola Superiore Sant'Anna, Via G. Moruzzi 1, 56124 Pisa, Italy.

Publication information

Sensors (Basel). 2024 Sep 13;24(18):5948. doi: 10.3390/s24185948.

Abstract

Cloud-based management of Distributed Acoustic Sensing (DAS) data faces two primary challenges. The first is developing efficient storage mechanisms capable of handling the enormous volume of data generated by these sensors. To address it, we design and implement a pipeline that efficiently sends the data to DynamoDB, exploiting the low latency of the DynamoDB storage system for a benchmark DAS scheme that performs continuous monitoring over a 100 km range at meter-scale spatial resolution. We employ the DynamoDB service of Amazon Web Services (AWS), which offers highly expandable storage capacity with access latencies of a few tens of milliseconds. The different stages of DAS data handling are performed in a pipeline, and the scheme is optimized for high overall throughput with reduced latency, making it suitable for concurrent, real-time event extraction as well as minimal storage of raw and intermediate data. In addition, the scalability of the DynamoDB-based storage scheme is evaluated for linear and nonlinear variations in the number of access batches and for a wide range of data sample sizes corresponding to sensing ranges of 1-110 km. The results show latencies of 40 ms per batch of access with low standard deviations of a few milliseconds, and the latency per sample decreases as the sample size increases, paving the way toward scalable, cloud-based data storage services that integrate additional post-processing for more precise feature extraction. The technique greatly simplifies DAS data handling in key application areas requiring continuous, large-scale measurement schemes. The second challenge is that processing raw traces in a long-distance DAS for real-time monitoring requires careful design of computational resources to guarantee the requisite dynamic performance.
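The batched storage and latency measurement described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the names `sensing_points`, `chunk`, `timed_upload`, and `summarize` are hypothetical, the one-record-per-item layout is an assumption, and `write_batch` is a stand-in for whatever actually sends a batch to DynamoDB (for example, a function built on boto3's `Table.batch_writer()`). The batch size of 25 reflects the documented limit of DynamoDB's `BatchWriteItem` operation.

```python
import statistics
import time
from typing import Any, Callable, Dict, Iterator, List, Sequence

# DynamoDB's BatchWriteItem accepts at most 25 put/delete requests per call.
BATCH_LIMIT = 25


def sensing_points(range_km: float, resolution_m: float = 1.0) -> int:
    """Spatial samples in one trace for a given range and spatial resolution."""
    return int(round(range_km * 1000 / resolution_m))


def chunk(items: Sequence[Any], size: int = BATCH_LIMIT) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def timed_upload(
    items: Sequence[Any],
    write_batch: Callable[[Sequence[Any]], None],
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Send items batch by batch and return the latency of each batch in ms.

    `write_batch` is a hypothetical stand-in for the real storage call.
    """
    latencies_ms = []
    for batch in chunk(items):
        t0 = clock()
        write_batch(batch)
        latencies_ms.append((clock() - t0) * 1000.0)
    return latencies_ms


def summarize(latencies_ms: Sequence[float], n_samples: int) -> Dict[str, float]:
    """Per-batch mean and standard deviation, plus latency per stored sample."""
    return {
        "batches": len(latencies_ms),
        "mean_ms": statistics.fmean(latencies_ms),
        "std_ms": statistics.pstdev(latencies_ms),
        "ms_per_sample": sum(latencies_ms) / n_samples,
    }


if __name__ == "__main__":
    # A 100 km fiber at 1 m resolution yields 100,000 points per trace.
    print(sensing_points(100))
    # Dry run with a no-op writer, so no AWS account is needed.
    stats = summarize(timed_upload(list(range(1000)), lambda b: None), 1000)
    print(stats["batches"])
```

Passing the writer as a callable keeps the timing logic independent of the storage backend, so the same loop can be exercised offline before it is pointed at a real table.
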
We then focus on the design of a system for evaluating the performance of cloud computing resources for diverse computations on DAS data. The system is intended to reveal performance metrics and operational efficiencies of cloud-side computations on the data, providing a deeper understanding of system performance, identifying potential bottlenecks, and suggesting areas for improvement. To this end, we employ the CloudSim framework. The analysis reveals that more capable virtual machines (VMs) significantly reduce processing time, with VM performance influenced by the number of Processing Elements (PEs) and the Million Instructions Per Second (MIPS) rating. The results also show that, although the number of computations, and hence the processing time, grows with fiber length, the overall speed of computation remains suitable for continuous real-time monitoring. VMs with lower processing speed and fewer CPUs also exhibit more inconsistent processing times than higher-performance VMs, which do not incur significantly higher prices. Additionally, the impact of VM parameters on computation time is explored, highlighting the importance of resource optimization in DAS system design. The study also observes a notable trend in processing time, which decreases significantly for every additional 50,000 columns processed as the fiber length increases. This finding underscores the efficiency gains achieved with larger computational loads, indicating improved system performance and capacity utilization as the DAS system processes more extensive datasets.
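The dependence of processing time on PEs and MIPS can be illustrated with an idealized estimate: total work in millions of instructions (MI) divided by the aggregate VM capacity. The Python sketch below is only an assumption-laden illustration, not CloudSim's scheduler or the authors' configuration. The `VM` class, the `processing_time_s` function, the cost per column `mi_per_column`, and all numeric values are hypothetical, and the work is assumed to parallelize perfectly across PEs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    """Hypothetical VM description: number of PEs and MIPS rating per PE."""
    name: str
    pes: int
    mips_per_pe: float

    @property
    def capacity_mips(self) -> float:
        # Aggregate capacity, assuming every PE can be kept fully busy.
        return self.pes * self.mips_per_pe


def processing_time_s(columns: int, mi_per_column: float, vm: VM) -> float:
    """Idealized time in seconds: total work (MI) divided by VM capacity (MIPS)."""
    total_mi = columns * mi_per_column
    return total_mi / vm.capacity_mips


if __name__ == "__main__":
    # Illustrative values only; none of them come from the paper.
    small = VM("small", pes=1, mips_per_pe=1000.0)
    large = VM("large", pes=4, mips_per_pe=2500.0)
    for vm in (small, large):
        print(vm.name, processing_time_s(50_000, 2.0, vm))
```

Under this model, time scales linearly with the number of columns and inversely with PEs times MIPS. It therefore captures only the first-order effect of VM capability, not the run-to-run variability or the per-column efficiency gains reported above.
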


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ad2/11436233/18b2f2d035ab/sensors-24-05948-g001.jpg
