Šubert Martin, Novotný Michal, Tykalová Tereza, Srpová Barbora, Friedová Lucie, Uher Tomáš, Horáková Dana, Rusz Jan
Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic.
Department of Neurology and Centre of Clinical Neuroscience, First Faculty of Medicine, Charles University and General University Hospital, Prague, Czech Republic.
Ther Adv Neurol Disord. 2023 Jun 25;16:17562864231180719. doi: 10.1177/17562864231180719. eCollection 2023.
Impairment of higher language functions associated with natural spontaneous speech in multiple sclerosis (MS) remains underexplored.
We presented a fully automated method for discriminating MS patients from healthy controls based on lexical and syntactic linguistic features.
We enrolled 120 individuals with MS, with Expanded Disability Status Scale scores ranging from 1 to 6.5, and 120 age-, sex-, and education-matched healthy controls. Linguistic analysis was performed with fully automated methods based on automatic speech recognition and natural language processing techniques, using eight lexical and syntactic features acquired from spontaneous discourse. Fully automated annotations were compared with human annotations.
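As a rough illustration of how lexical and syntactic features of this kind can be derived from a transcript, the following minimal Python sketch computes four of the measures named in the results (content-word ratio, function-word ratio, verb-to-noun ratio, and mean utterance length). It assumes the transcript has already been split into utterances and tagged with Universal Dependencies part-of-speech labels by some upstream tool. The function name `discourse_features`, the tag groupings, and the sample input are hypothetical and are not taken from the study, whose exact feature definitions are not given here.

```python
from typing import Dict, List, Tuple

# Assumed groupings of Universal Dependencies POS tags (an illustrative choice,
# not the study's definition of content and function words).
CONTENT_TAGS = {"NOUN", "PROPN", "VERB", "ADJ", "ADV"}
FUNCTION_TAGS = {"ADP", "AUX", "CCONJ", "DET", "PART", "PRON", "SCONJ"}
NOUN_TAGS = {"NOUN", "PROPN"}

# One utterance is a list of (token, POS tag) pairs.
Utterance = List[Tuple[str, str]]


def discourse_features(utterances: List[Utterance]) -> Dict[str, float]:
    """Compute simple lexical and syntactic ratios from POS-tagged utterances."""
    # Ignore punctuation so that ratios are computed over word tokens only.
    tags = [tag for utt in utterances for _, tag in utt if tag != "PUNCT"]
    total = len(tags)
    if total == 0:
        raise ValueError("transcript contains no word tokens")

    n_content = sum(tag in CONTENT_TAGS for tag in tags)
    n_function = sum(tag in FUNCTION_TAGS for tag in tags)
    n_noun = sum(tag in NOUN_TAGS for tag in tags)
    n_verb = sum(tag == "VERB" for tag in tags)

    # Utterance length measured in word tokens (punctuation excluded).
    lengths = [sum(tag != "PUNCT" for _, tag in utt) for utt in utterances]

    return {
        "content_ratio": n_content / total,
        "function_ratio": n_function / total,
        "verb_noun_ratio": n_verb / n_noun if n_noun else float("inf"),
        "mean_utterance_length": sum(lengths) / len(utterances),
    }


# Hypothetical two-utterance transcript, invented for illustration only.
sample = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB"), (".", "PUNCT")],
    [("she", "PRON"), ("quickly", "ADV"), ("left", "VERB"), ("and", "CCONJ"),
     ("he", "PRON"), ("stayed", "VERB"), (".", "PUNCT")],
]
features = discourse_features(sample)
```

Working from pre-tagged tokens keeps the sketch independent of any particular speech recognizer or tagger; in practice the tags would come from whichever natural language processing pipeline is used.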
Compared with healthy controls, lexical impairment in MS consisted of an increase in content words (p = 0.037), a decrease in function words (p = 0.007), and overuse of verbs at the expense of nouns (p = 0.047), while syntactic impairment manifested as shorter utterance length (p = 0.002) and a lower number of coordinate clauses (p < 0.001). A fully automated language analysis approach enabled discrimination between MS and controls with an area under the curve of 0.70. A significant relationship was detected between shorter utterance length and lower symbol digit modalities test score (r = 0.25, p = 0.008). Strong associations between a majority of automatically and manually computed features were observed (r > 0.88, p < 0.001).
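For readers unfamiliar with the area under the curve, it can be read as the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control. The minimal Python sketch below computes it directly from that pairwise definition. The function name `pairwise_auc` and the scores are hypothetical toy values, not data from the study, and the classifier that produced the reported 0.70 is not specified here.

```python
from typing import Sequence


def pairwise_auc(patient_scores: Sequence[float],
                 control_scores: Sequence[float]) -> float:
    """Area under the ROC curve via its pairwise (Mann-Whitney) definition."""
    if not patient_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0   # patient correctly ranked above control
            elif p == c:
                wins += 0.5   # ties count as half
    return wins / (len(patient_scores) * len(control_scores))


# Toy scores invented for illustration only.
toy_patients = [0.9, 0.6, 0.4]
toy_controls = [0.7, 0.3, 0.2]
toy_auc = pairwise_auc(toy_patients, toy_controls)  # 7 of 9 pairs ranked correctly
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so 0.70 indicates moderate discrimination.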
Automated discourse analysis has the potential to provide an easy-to-implement and low-cost language-based biomarker of cognitive decline in MS for future clinical trials.