Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection.

Author information

Little Max A, McSharry Patrick E, Roberts Stephen J, Costello Declan A E, Moroz Irene M

Affiliation

Systems Analysis, Modelling and Prediction Group, Department of Engineering Science, University of Oxford, Oxford, UK.

Publication information

Biomed Eng Online. 2007 Jun 26;6:23. doi: 10.1186/1475-925X-6-23.

Abstract

BACKGROUND

Voice disorders affect patients profoundly, and acoustic tools can potentially measure voice function objectively. Disordered sustained vowels exhibit wide-ranging phenomena, from nearly periodic to highly complex, aperiodic vibrations, and increased "breathiness". Modelling and surrogate data studies have shown significant nonlinear and non-Gaussian random properties in these sounds. Nonetheless, existing tools are limited to analysing voices displaying near periodicity, and do not account for this inherent biophysical nonlinearity and non-Gaussian randomness, often using linear signal processing methods insensitive to these properties. They do not directly measure the two main biophysical symptoms of disorder: complex nonlinear aperiodicity, and turbulent, aeroacoustic, non-Gaussian randomness. Often these tools cannot be applied to more severely disordered voices, limiting their clinical usefulness.

METHODS

This paper introduces two new tools to speech analysis, recurrence and fractal scaling, which overcome the range limitations of existing tools by directly addressing these two symptoms of disorder; together they reproduce a "hoarseness" diagram. A simple bootstrapped classifier then uses these two features to distinguish normal from disordered voices.
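The abstract does not spell out how the fractal scaling feature is computed. As an illustration only, the sketch below uses detrended fluctuation analysis (DFA), one standard way to estimate a scaling exponent from a noisy signal. The function names, window sizes, and synthetic test signals are assumptions made for this example, not details taken from the abstract.

```python
import math
import random


def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_exponent(signal, window_sizes):
    """Estimate the DFA scaling exponent of a 1-D signal (illustrative sketch)."""
    # Step 1: integrate the mean-removed signal to obtain the "profile".
    mean = sum(signal) / len(signal)
    profile, running = [], 0.0
    for value in signal:
        running += value - mean
        profile.append(running)

    log_sizes, log_flucts = [], []
    for size in window_sizes:
        times = list(range(size))
        n_windows = len(profile) // size
        squared_error = 0.0
        # Step 2: remove a straight-line trend from each non-overlapping window.
        for w in range(n_windows):
            segment = profile[w * size:(w + 1) * size]
            slope, intercept = linear_fit(times, segment)
            squared_error += sum(
                (y - (slope * t + intercept)) ** 2
                for t, y in zip(times, segment)
            )
        # Step 3: root-mean-square fluctuation at this window size.
        fluctuation = math.sqrt(squared_error / (n_windows * size))
        log_sizes.append(math.log(size))
        log_flucts.append(math.log(fluctuation))

    # Step 4: the exponent is the slope of log F(L) against log L.
    alpha, _ = linear_fit(log_sizes, log_flucts)
    return alpha


# Synthetic test signals (assumed for illustration, not voice recordings).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
walk, position = [], 0.0
for step in noise:
    position += step
    walk.append(position)

sizes = [16, 32, 64, 128, 256]
alpha_noise = dfa_exponent(noise, sizes)
alpha_walk = dfa_exponent(walk, sizes)
print(f"white noise: {alpha_noise:.2f}, random walk: {alpha_walk:.2f}")
```

In theory the exponent is about 0.5 for uncorrelated noise and about 1.5 for a random walk; finite-length estimates deviate somewhat from these values.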

RESULTS

On a large database of subjects with a wide variety of voice disorders, these new techniques distinguish normal from disordered cases using quadratic discriminant analysis, with an overall correct classification performance of 91.8 ± 2.0%. The true positive classification performance is 95.4 ± 3.2%, and the true negative performance is 91.5 ± 2.3% (95% confidence). This is shown to outperform all combinations of the most popular classical tools.
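To make the classification step concrete, here is a minimal sketch of quadratic discriminant analysis on two features with a bootstrap estimate of accuracy. The resampling scheme (fit on a bootstrap resample, score on the samples left out of it), the percentile interval, all function names, and the synthetic data are assumptions for illustration; the abstract does not describe the exact procedure, and the numbers this sketch produces have no relation to the reported results.

```python
import math
import random


def fit_gaussian(points):
    """Mean and 2x2 covariance (sxx, sxy, syy) of a list of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def fit_qda(samples):
    """samples: list of ((f1, f2), label) with labels 0 and 1."""
    model = {}
    for label in (0, 1):
        points = [x for x, y in samples if y == label]
        mean, cov = fit_gaussian(points)
        model[label] = (mean, cov, len(points) / len(samples))
    return model


def qda_score(x, mean, cov, prior):
    """Gaussian log-discriminant: -0.5 ln|S| - 0.5 d^T S^-1 d + ln(prior)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * math.log(det) - 0.5 * maha + math.log(prior)


def predict(model, x):
    """Return the label whose class-specific Gaussian scores highest."""
    return max(model, key=lambda label: qda_score(x, *model[label]))


def bootstrap_accuracy(samples, n_rounds, rng):
    """Mean held-out accuracy and a 2.5%-97.5% percentile interval."""
    n = len(samples)
    accuracies = []
    for _ in range(n_rounds):
        chosen = [rng.randrange(n) for _ in range(n)]
        chosen_set = set(chosen)
        train = [samples[i] for i in chosen]
        held_out = [samples[i] for i in range(n) if i not in chosen_set]
        # Skip resamples too degenerate to estimate a covariance matrix.
        too_few = any(
            len({x for x, y in train if y == label}) < 3 for label in (0, 1)
        )
        if too_few or not held_out:
            continue
        model = fit_qda(train)
        correct = sum(predict(model, x) == y for x, y in held_out)
        accuracies.append(correct / len(held_out))
    accuracies.sort()
    mean = sum(accuracies) / len(accuracies)
    low = accuracies[int(0.025 * len(accuracies))]
    high = accuracies[min(len(accuracies) - 1, int(0.975 * len(accuracies)))]
    return mean, low, high


# Synthetic two-feature data (invented for this sketch, not the study's data).
rng = random.Random(1)
data = [((rng.gauss(0.3, 0.05), rng.gauss(0.6, 0.05)), 0) for _ in range(60)]
data += [((rng.gauss(0.7, 0.10), rng.gauss(0.75, 0.10)), 1) for _ in range(60)]

mean_acc, low, high = bootstrap_accuracy(data, 200, rng)
print(f"accuracy {mean_acc:.3f} (interval {low:.3f} to {high:.3f})")
```

Because each class gets its own covariance matrix, the decision boundary is a quadratic curve rather than a straight line, which suits classes with very different spreads.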

CONCLUSION

Existing techniques involve a very large number of arbitrary parameters and considerable computational complexity. By comparison, these new techniques are far simpler and yet achieve clinically useful classification performance using only a basic classification technique. They do so by exploiting the inherent nonlinearity and turbulent randomness in disordered voice signals. They are widely applicable to the whole range of disordered voice phenomena by design. These new measures could therefore be used for a variety of practical clinical purposes.
