Emvoliadis Alexandros, Vryzas Nikolaos, Stamatiadou Marina-Eirini, Vrysis Lazaros, Dimoulas Charalampos
Multidisciplinary Media & Mediated Communication Research Group (M3C), Aristotle University, 54636 Thessaloniki, Greece.
Sensors (Basel). 2024 Apr 26;24(9):2755. doi: 10.3390/s24092755.
This study presents a novel audio compression technique tailored for environmental monitoring within multi-modal data processing pipelines. Given the crucial role that audio data play in environmental evaluations, particularly under extreme resource constraints, our strategy substantially decreases bit rates to enable efficient data transfer and storage. It does so without undermining the accuracy required for trustworthy air pollution analysis, while also minimizing processing costs. More specifically, our approach combines a Deep-Learning-based model, optimized for edge devices, with a conventional coding scheme for audio compression. Once transmitted to the cloud, the compressed data are decoded using the far larger computing resources available there, allowing accurate reconstruction and classification. The experimental results indicate that our approach incurs only a relatively minor decrease in accuracy, even at notably low bit rates, and remains robust when identifying data from labels not included in our training dataset.
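As a rough illustration of the edge-to-cloud split described in the abstract, the following minimal Python sketch pairs a lossy reduction step on the edge with a conventional lossless coder, and reverses both steps on the cloud side. It is not the authors' method: the average-pooling step is only a placeholder for the learned edge model, and the names `edge_encode`, `cloud_decode`, `bitrate_kbps`, the 16 kHz sample rate, the pooling factor, the 8-bit quantizer, and the use of zlib as the "conventional" coder are all assumptions made for this example.

```python
import math
import zlib

SAMPLE_RATE = 16_000  # assumed sample rate in Hz (not taken from the paper)
POOL = 8              # assumed reduction factor of the placeholder "encoder"


def edge_encode(samples):
    """Edge side: reduce, quantize to 8 bits, then apply a conventional coder.

    Average pooling stands in for the learned model; zlib stands in for
    the conventional coding scheme. Samples are expected in [-1.0, 1.0].
    """
    latent = [
        sum(samples[i:i + POOL]) / POOL
        for i in range(0, len(samples) - POOL + 1, POOL)
    ]
    # Map [-1, 1] to integer codes 0..255, clamping out-of-range values.
    quantized = bytes(
        min(255, max(0, round((v + 1.0) * 127.5))) for v in latent
    )
    return zlib.compress(quantized, 9)


def cloud_decode(payload):
    """Cloud side: undo the conventional coder, dequantize, and upsample."""
    quantized = zlib.decompress(payload)
    latent = [q / 127.5 - 1.0 for q in quantized]
    # Sample-and-hold upsampling back to the original rate.
    return [v for v in latent for _ in range(POOL)]


def bitrate_kbps(payload, n_samples):
    """Bit rate of the transmitted payload in kilobits per second."""
    seconds = n_samples / SAMPLE_RATE
    return len(payload) * 8 / seconds / 1000


if __name__ == "__main__":
    # One second of a 440 Hz tone as a toy input signal.
    tone = [
        0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)
    ]
    payload = edge_encode(tone)
    restored = cloud_decode(payload)
    print(f"payload: {len(payload)} bytes, "
          f"{bitrate_kbps(payload, len(tone)):.2f} kbps, "
          f"{len(restored)} samples restored")
```

In a real deployment the pooling step would be replaced by a trained encoder and the sample-and-hold step by a trained decoder running in the cloud, with a classifier applied to the reconstruction; the sketch only shows where each stage sits and how the transmitted bit rate is measured.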