MIT Lincoln Laboratory, Lexington, MA, USA.
Speech and Hearing Bioscience and Technology, Harvard Medical School, Boston, MA, USA.
Sci Rep. 2023 Jan 28;13(1):1567. doi: 10.1038/s41598-023-27934-4.
In the face of the global COVID-19 pandemic, researchers have increasingly turned to simple measures to detect and monitor the presence of the disease in individuals at home. We sought to determine whether measures of neuromotor coordination derived from acoustic time series, together with phoneme-based and standard acoustic features extracted from recordings of simple speech tasks, could aid in detecting the presence of COVID-19. We further hypothesized that these features would aid in characterizing the effect of COVID-19 on speech production systems. A protocol consisting of a variety of speech tasks was administered to 12 individuals with COVID-19 and 15 individuals with other viral infections at University Hospital Galway. From these recordings, we extracted a set of acoustic time series representative of speech production subsystems, as well as their univariate statistics. The time series were further used to derive correlation-based features, a proxy for speech production motor coordination. We additionally extracted phoneme-based features. These features were used to create machine learning models to distinguish between the COVID-19 positive and other viral infection groups, with respiratory- and laryngeal-based features resulting in the highest performance. Coordination-based features derived from harmonic-to-noise ratio time series from read speech discriminated between the two groups with an area under the ROC curve (AUC) of 0.94. A longitudinal case study of two subjects, one from each group, revealed differences in laryngeal-based acoustic features, consistent with observed physiological differences between the two groups. The results from this analysis highlight the promise of using nonintrusive sensing through simple speech recordings for early warning and tracking of COVID-19.
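The abstract does not say how the correlation-based coordination features are computed. As an illustration only, the sketch below assumes one common construction: stack time-delayed copies of a single acoustic time series (for example, a harmonic-to-noise ratio track), form their correlation matrix, and use its eigenvalue spectrum as the feature vector. The function names, the number of delays, the delay spacing, and the synthetic input are all assumptions made for this example and are not taken from the study.

```python
import numpy as np


def delay_embed(x, n_delays=15, spacing=1):
    """Stack time-delayed copies of a 1-D series as rows (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    span = (n_delays - 1) * spacing
    # Each row needs at least two samples for a correlation to be defined.
    if x.size <= span + 1:
        raise ValueError("series too short for the requested delays")
    n = x.size - span
    return np.stack([x[i * spacing: i * spacing + n] for i in range(n_delays)])


def coordination_features(x, n_delays=15, spacing=1):
    """Eigenvalues (descending) of the delay-correlation matrix.

    Assumed feature definition: a spectrum concentrated in a few large
    eigenvalues means the delayed copies are strongly coupled, while a
    flat spectrum means they are close to independent.
    """
    emb = delay_embed(x, n_delays, spacing)
    corr = np.corrcoef(emb)             # rows are treated as variables
    eigvals = np.linalg.eigvalsh(corr)  # symmetric matrix, ascending order
    return eigvals[::-1]


if __name__ == "__main__":
    # Synthetic stand-in for an acoustic time series; not real speech data.
    rng = np.random.default_rng(0)
    t = np.arange(2000)
    track = np.sin(2 * np.pi * t / 50) + 0.3 * rng.standard_normal(t.size)
    print(coordination_features(track, n_delays=10))
```

A vector like this, computed per recording, could then be passed to any standard classifier and scored with the area under the ROC curve. The classifier and validation scheme used in the study are not stated in the abstract.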