Biomedical Engineering Department of University of North Carolina at Chapel Hill, Chapel Hill, NC, United States of America.
SLB Consulting, Winton, Cumbria, United Kingdom.
PLoS One. 2023 Aug 10;18(8):e0283953. doi: 10.1371/journal.pone.0283953. eCollection 2023.
Doppler ultrasound (DU) is used in decompression research to detect venous gas emboli in the precordium or subclavian vein as a marker of decompression stress. This is relevant to scuba divers, compressed air workers and astronauts, for preventing the decompression sickness (DCS) that these bubbles can cause during or after a sudden reduction in ambient pressure. Doppler ultrasound data are graded by expert raters on the Kisman-Masurel or Spencer scales, which are associated with DCS risk. Meta-analyses and efforts to automate DU grading by computer both require access to large databases of well-curated, graded data. Leveraging previously collected data is especially important because the large-scale extreme military pressure exposures conducted from the 1970s to the 1990s in austere environments are difficult to repeat. Historically, DU data (Non-speech) were often captured on cassettes as one-channel audio, with superimposed human speech describing the experiment (Speech). Digitizing and separating these audio files is currently a lengthy, manual task. In this paper, we develop a graphical user interface (GUI) to perform automatic speech recognition and aid in Non-speech and Speech separation. This is the first study to incorporate speech processing technology in the field of diving research. If successful, it has the potential to significantly accelerate the reuse of previously acquired datasets. The recognition task uses the Google speech recognizer to detect human voice activity together with the corresponding timestamps. The detected human speech is then separated from the audio Doppler ultrasound within the developed GUI. Several experiments were conducted on recently digitized audio Doppler recordings to assess the effectiveness of the developed GUI in the recognition and separation tasks, and the results are compared to manual labels for Speech timestamps.
Two metrics are used to evaluate performance: the average absolute difference between the reference and detected Speech starting points, and the percentage of detected Speech relative to the total duration of the reference Speech. The results demonstrate the efficacy of the developed GUI in separating Speech and Non-speech components.
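The two evaluation metrics can be illustrated with a minimal sketch. The abstract does not give their exact definitions, so the code below rests on assumptions: Speech segments are represented as (start, end) pairs in seconds, each reference start is paired with the nearest detected start, and "detected Speech" is counted as the time that detected segments overlap reference segments. The function names and the example intervals are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed representation: one Speech segment = (start_seconds, end_seconds).
Interval = Tuple[float, float]


def mean_abs_start_difference(reference: List[Interval],
                              detected: List[Interval]) -> float:
    """Average absolute difference (s) between each reference Speech start
    and the nearest detected Speech start (nearest-start pairing is an
    assumption, not the paper's stated procedure)."""
    if not reference or not detected:
        raise ValueError("both interval lists must be non-empty")
    detected_starts = [start for start, _ in detected]
    diffs = [min(abs(ref_start - d) for d in detected_starts)
             for ref_start, _ in reference]
    return sum(diffs) / len(diffs)


def detected_speech_percentage(reference: List[Interval],
                               detected: List[Interval]) -> float:
    """Percentage of the total reference Speech duration that is covered by
    detected Speech. Assumes segments within each list do not overlap
    one another."""
    total_reference = sum(end - start for start, end in reference)
    if total_reference <= 0:
        raise ValueError("reference Speech duration must be positive")
    overlap = 0.0
    for ref_start, ref_end in reference:
        for det_start, det_end in detected:
            # Length of the intersection of the two segments (0 if disjoint).
            overlap += max(0.0, min(ref_end, det_end) - max(ref_start, det_start))
    return 100.0 * overlap / total_reference


# Hypothetical manual labels and detector output for one recording.
reference_speech = [(10.0, 14.0), (30.0, 36.0)]
detected_speech = [(10.5, 13.5), (29.0, 35.0)]

start_error_s = mean_abs_start_difference(reference_speech, detected_speech)
coverage_pct = detected_speech_percentage(reference_speech, detected_speech)
```

Under these assumptions, a lower start difference indicates more accurate Speech onset detection, and a coverage closer to 100% indicates that less reference Speech was missed. If the paper instead defines the second metric as total detected duration divided by total reference duration, the overlap loop would be replaced by a simple sum over the detected segments.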