Ahangaran Meysam, Dawalatabad Nauman, Karjadi Cody, Glass James, Au Rhoda, Kolachalama Vijaya B
Department of Medicine, Boston University Chobanian and Avedisian School of Medicine, 72 E. Concord St, Boston, MA, USA - 02118.
Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA, USA - 02139.
medRxiv. 2024 Nov 28:2024.11.25.24317900. doi: 10.1101/2024.11.25.24317900.
Digital voice analysis is gaining traction as a tool to differentiate cognitively normal from impaired individuals. However, voice data poses privacy risks due to the potential identification of speakers by automated systems.
We developed a framework that uses weighted linear interpolation of privacy and utility metrics to balance speaker obfuscation against the preservation of cognitive information in voice-based cognitive assessments. The framework applies pitch-shifting to obfuscate speaker identity while preserving cognitive speech features. We tested it on digital voice recordings from the Framingham Heart Study (FHS; N=128) and the DementiaBank Delaware corpus (N=85), both of which contain responses to neuropsychological tests.
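The weighted linear interpolation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the assumption that both metrics are scaled to [0, 1] with higher meaning better, the idea of scoring a small set of candidate pitch shifts, and all numeric values are hypothetical placeholders rather than results from the study.

```python
from typing import Dict, Tuple


def combined_score(privacy: float, utility: float, weight: float) -> float:
    """Weighted linear interpolation of a privacy and a utility metric.

    Both metrics are assumed to lie in [0, 1], higher being better.
    weight = 1.0 scores privacy only; weight = 0.0 scores utility only.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * privacy + (1.0 - weight) * utility


def select_pitch_shift(
    candidates: Dict[float, Tuple[float, float]], weight: float
) -> float:
    """Return the candidate pitch shift (semitones) with the highest score."""
    return max(
        candidates,
        key=lambda shift: combined_score(*candidates[shift], weight),
    )


# Hypothetical placeholder values, NOT taken from the study:
# pitch shift in semitones -> (privacy metric, utility metric)
candidates = {
    0.0: (0.10, 0.80),
    2.0: (0.40, 0.75),
    4.0: (0.60, 0.50),
}

best_balanced = select_pitch_shift(candidates, weight=0.5)
best_private = select_pitch_shift(candidates, weight=1.0)
best_useful = select_pitch_shift(candidates, weight=0.0)
```

With these placeholder values, an equal weighting favors the intermediate shift, while the extreme weights select the most private or the most useful setting. In practice the privacy metric might come from a speaker verification system and the utility metric from a cognitive status classifier, but which metrics are used is an assumption here.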
The tool effectively obfuscated speaker identity while maintaining cognitive feature integrity, achieving an accuracy of 0.6465 in classifying individuals with normal cognition, mild cognitive impairment, and dementia in the FHS cohort.
Our approach enables the development of digital markers for dementia assessment while protecting sensitive personal information, offering a scalable solution for privacy-preserving voice-based diagnostics.