Gençağa Deniz
Department of Electrical and Electronics Engineering, Antalya Bilim University, Antalya 07190, Turkey.
Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Entropy (Basel). 2023 Nov 23;25(12):1577. doi: 10.3390/e25121577.
This paper provides a methodology for better understanding the relationships between different aspects of vocal fold motion, which are used as features in machine learning-based approaches for detecting respiratory infections from voice recordings. The relationships are derived through a joint multivariate analysis of the vocal fold oscillations of speakers. Specifically, the multivariate setting covers the displacements and velocities of the left and right vocal folds, derived from recordings of five extended vowel sounds for each speaker (/aa/, /iy/, /ey/, /uw/, and /ow/). In this setting, the differences between the bivariate and conditional interactions are analyzed using information-theoretic quantities based on transfer entropy. Incorporating the conditional quantities reveals confounding factors that can influence the statistical interactions among other pairs of variables. This is demonstrated on a vector autoregressive process for which the analytical derivations can be carried out. As a proof of concept, the methodology is applied to a clinically curated COVID-19 dataset. The findings suggest that the interaction between the vocal fold oscillations can change across individuals and with the presence of a respiratory infection, such as COVID-19. The proposed approach can be used in future studies to guide the selection of appropriate features for voice-based diagnostics, as a supplementary or early detection tool.
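The contrast between bivariate and conditional transfer entropy can be illustrated with a minimal sketch on a toy vector autoregressive process. The model below is not the one analyzed in the paper: the variables x, y, and z, their coefficients, and the helper names are hypothetical, and the estimator assumes jointly Gaussian variables, for which transfer entropy reduces to half the log-ratio of linear-regression residual variances. Here z drives both x and y, while x has no direct influence on y.

```python
import numpy as np

def residual_variance(target, predictors):
    """Variance of the least-squares residual of target on predictors (with intercept)."""
    design = np.column_stack([np.ones(len(target))] + list(predictors))
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.var(target - design @ coef)

def gaussian_te(source, target, conditions=(), lag=1):
    """Linear-Gaussian transfer entropy (in nats) from source to target.

    Assumption: jointly Gaussian variables and a single lag, so the estimate is
    0.5 * ln(restricted residual variance / full residual variance).
    """
    future = target[lag:]
    past = [target[:-lag]] + [c[:-lag] for c in conditions]
    restricted = residual_variance(future, past)
    full = residual_variance(future, past + [source[:-lag]])
    return 0.5 * np.log(restricted / full)

# Hypothetical toy VAR: z is a common driver (confounder) of x and y.
rng = np.random.default_rng(0)
n = 20000
x, y, z = np.zeros(n), np.zeros(n), np.zeros(n)
for t in range(1, n):
    z[t] = 0.7 * z[t - 1] + rng.standard_normal()
    x[t] = 0.8 * z[t - 1] + rng.standard_normal()
    y[t] = 0.8 * z[t - 1] + rng.standard_normal()

te_bivariate = gaussian_te(x, y)                      # ignores z
te_conditional = gaussian_te(x, y, conditions=(z,))   # conditions on z's past

print(f"TE(x -> y)     = {te_bivariate:.4f} nats")
print(f"TE(x -> y | z) = {te_conditional:.4f} nats")
```

In this sketch the bivariate estimate is clearly positive even though x never enters the equation for y, while the conditional estimate stays close to zero once the past of z is accounted for. That gap is the kind of confounding effect the conditional quantities are meant to expose.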