Adnan Tariq, Abdelkader Abdelrahman, Liu Zipei, Hossain Ekram, Park Sooyong, Islam Md Saiful, Hoque Ehsan
Department of Computer Science, University of Rochester, Rochester, NY, USA.
Ministry of Defense Health Services, Riyadh, Saudi Arabia.
NPJ Parkinsons Dis. 2025 Jun 20;11(1):176. doi: 10.1038/s41531-025-00956-7.
We introduce a framework for screening Parkinson’s disease (PD) using English pangram utterances. Our dataset includes 1306 participants (392 with PD) from both home and clinical settings, covering diverse demographics (53.2% female). We used deep learning embeddings from Wav2Vec 2.0, WavLM, and ImageBind to capture speech dynamics indicative of PD. Our novel fusion model for PD classification aligns different speech embeddings into a cohesive feature space, outperforming baseline alternatives. In a stratified randomized split, the model achieved an AUROC of 88.9% and an accuracy of 85.7%. Statistical bias analysis showed equitable performance across sex, ethnicity, and age subgroups, with robustness across various disease durations and PD stages. Detailed error analysis revealed higher misclassification rates in specific age ranges for males and females, aligning with clinical insights. External testing yielded AUROCs of 82.1% and 78.4% on two clinical datasets, and an AUROC of 77.4% on an unseen general spontaneous English speech dataset, demonstrating versatility in natural speech analysis and potential for global accessibility and health equity.
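The abstract reports AUROC and accuracy on a stratified randomized split but does not say how either was computed. As a rough illustration only, the sketch below shows one common way to do both with the Python standard library. The function names, the 20% test fraction, the 0.5 decision threshold, and the seed are all assumptions made for this example and are not taken from the paper. Only the cohort counts (1306 participants, 392 with PD) come from the abstract.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair (Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Cohort sizes from the abstract: 392 with PD, the remaining 914 without.
cohort_labels = [1] * 392 + [0] * 914
train_idx, test_idx = stratified_split(cohort_labels)

# Hypothetical labels and model scores, used only to exercise the metrics.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
toy_accuracy = accuracy(toy_labels, toy_scores)
```

Stratifying by label keeps the PD prevalence (about 30% in this cohort) nearly identical in the training and test partitions, which matters because accuracy depends on class balance while AUROC does not depend on the choice of threshold.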