Neumann Michael, Kothare Hardik, Ramanarayanan Vikram
Modality.AI, Inc., San Francisco, USA.
Interspeech. 2023 Aug;2023:2353-2357. doi: 10.21437/interspeech.2023-2100.
Multiple speech biomarkers have been shown to carry useful information regarding Amyotrophic Lateral Sclerosis (ALS) pathology. We propose a two-step framework to compute optimal linear combinations (indexes) of these biomarkers that are more discriminative and noise-robust than the individual markers, which is important for clinical care and pharmaceutical trial applications. First, we use a hierarchical-clustering-based method to select representative speech metrics from a dataset comprising 143 people with ALS and 135 age- and sex-matched healthy controls. Second, we analyze three methods of index computation that optimize linear discriminability, the Youden index, and sparsity of logistic regression model weights, respectively, and evaluate their performance with 5-fold cross-validation. We find that the proposed indexes are generally more discriminative of bulbar vs. non-bulbar onset in ALS than both their individual component metrics and an equally weighted baseline.
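As a rough illustration of the index idea described in the abstract, the sketch below builds a weighted linear combination of metrics and scores it with the Youden index under 5-fold cross-validation, next to an equally weighted baseline. It is not the authors' implementation: Fisher's linear discriminant is assumed here as one standard way to optimize linear discriminability, the data are synthetic, and the names `fisher_weights` and `youden_j` are hypothetical. The clustering-based metric selection step is not shown.

```python
import numpy as np


def fisher_weights(X, y):
    """Fisher linear discriminant direction, w proportional to Sw^-1 (mu1 - mu0)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    w = np.linalg.solve(Sw, mu1 - mu0)
    # Scale so that the absolute weights sum to 1 (keeps the sign)
    return w / np.abs(w).sum()


def youden_j(scores, y):
    """Maximum of sensitivity + specificity - 1 over all thresholds.

    Assumes higher scores indicate the positive class (y == 1).
    """
    best = 0.0
    for t in np.unique(scores):
        pred = scores >= t
        sens = np.mean(pred[y == 1])
        spec = np.mean(~pred[y == 0])
        best = max(best, sens + spec - 1.0)
    return best


# Synthetic stand-in data (an assumption, not the study data):
# three "metrics", of which only the first two differ between groups.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 3))
X[y == 1, 0] += 1.0
X[y == 1, 1] += 0.5

# 5-fold cross-validation: fit weights on four folds, score the held-out fold.
# The threshold is chosen on the held-out fold, so J is somewhat optimistic.
folds = np.array_split(rng.permutation(n), 5)
j_index, j_equal = [], []
for test_idx in folds:
    train = np.ones(n, dtype=bool)
    train[test_idx] = False
    w = fisher_weights(X[train], y[train])
    j_index.append(youden_j(X[test_idx] @ w, y[test_idx]))
    j_equal.append(youden_j(X[test_idx].mean(axis=1), y[test_idx]))

print("mean Youden J, weighted index:", np.mean(j_index))
print("mean Youden J, equal weights: ", np.mean(j_equal))
```

Whether the weighted index beats the equal-weight baseline on real speech metrics depends on the data; the synthetic example only shows the mechanics of the comparison.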