Zhu Youxiang, Tran Bang, Liang Xiaohui, Batsis John A, Roth Robert M
Department of Computer Science, University of Massachusetts Boston, MA, USA.
School of Medicine, University of North Carolina, Chapel Hill, NC, USA.
Proc IEEE Int Conf Acoust Speech Signal Process. 2022 May;2022:6462-6466. doi: 10.1109/icassp43922.2022.9747006. Epub 2022 Apr 27.
Speech pauses are an effective biomarker for dementia detection. Recent deep learning models have exploited speech pauses to achieve highly accurate dementia detection, but they have not explored the interpretability of speech pauses, i.e., which pause positions and lengths affect the detection result, and how. In this paper, we study the positions and lengths of dementia-sensitive pauses using adversarial learning approaches. Specifically, we first use an adversarial attack approach that adds perturbations to the speech pauses of testing samples, aiming to reduce the confidence of the detection model. Then, we apply an adversarial training approach to evaluate how perturbations in training samples affect the detection model. We examine interpretability from the perspectives of model accuracy, pause context, and pause length. We found that, from the model's perspective, some pauses are more sensitive to dementia than others, e.g., speech pauses near the verb "is". Increasing the lengths of sensitive pauses or adding sensitive pauses pushes the model's inference toward Alzheimer's Disease (AD), while decreasing the lengths of sensitive pauses or deleting them pushes it toward non-AD.
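The pause-perturbation attack described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the detector `prob_ad` is a toy logistic scorer rather than the authors' deep learning model, the weights in `TOY_WEIGHTS` are invented, and `greedy_pause_attack` is a simple greedy search standing in for whatever attack procedure the authors used. It shows only the general idea of lengthening, shortening, or deleting pauses to lower the model's confidence in its original prediction.

```python
import math

# Hypothetical per-context sensitivity weights; NOT taken from the paper.
TOY_WEIGHTS = {"is": 1.5, "the": 0.2}
DEFAULT_WEIGHT = 0.5
BIAS = -1.0


def prob_ad(pauses):
    """Toy stand-in detector: P(AD) from (preceding_word, pause_seconds) pairs."""
    score = BIAS + sum(TOY_WEIGHTS.get(w, DEFAULT_WEIGHT) * d for w, d in pauses)
    return 1.0 / (1.0 + math.exp(-score))


def greedy_pause_attack(pauses, step=0.5, budget=2):
    """Edit up to `budget` pauses by +/- `step` seconds to lower the
    detector's confidence in its original prediction."""
    pauses = list(pauses)
    predicted_ad = prob_ad(pauses) >= 0.5

    def confidence(p):
        # Confidence in the class the detector originally predicted.
        q = prob_ad(p)
        return q if predicted_ad else 1.0 - q

    edits = []
    for _ in range(budget):
        best = None
        for i, (word, dur) in enumerate(pauses):
            for delta in (-step, step):
                new_dur = max(0.0, dur + delta)  # 0.0 means the pause is deleted
                if new_dur == dur:
                    continue
                trial = pauses[:i] + [(word, new_dur)] + pauses[i + 1:]
                c = confidence(trial)
                if best is None or c < best[0]:
                    best = (c, i, new_dur)
        # Stop early if no single edit lowers the confidence any further.
        if best is None or best[0] >= confidence(pauses):
            break
        _, i, new_dur = best
        edits.append((pauses[i][0], pauses[i][1], new_dur))
        pauses[i] = (pauses[i][0], new_dur)
    return pauses, edits


if __name__ == "__main__":
    sample = [("is", 1.0), ("the", 1.0), ("and", 1.0)]
    adversarial, edits = greedy_pause_attack(sample)
    print("P(AD) before:", prob_ad(sample))
    print("P(AD) after: ", prob_ad(adversarial))
    print("edits (word, old_seconds, new_seconds):", edits)
```

In this toy setup the attack repeatedly shortens the pause after "is", because that context carries the largest invented weight, and the prediction moves from AD toward non-AD. This mirrors the direction of the reported finding but is not evidence for it.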