Icahn School of Medicine at Mount Sinai, New York, New York.
Emory University, Atlanta, Georgia.
Biol Psychiatry Cogn Neurosci Neuroimaging. 2023 Oct;8(10):1005-1012. doi: 10.1016/j.bpsc.2023.06.007. Epub 2023 Jul 5.
Basic self-disturbance, or anomalous self-experiences (ASEs), is a core feature of the schizophrenia spectrum. We propose a novel method of natural language processing to quantify ASEs in spoken language by direct comparison to an inventory of self-disturbance, the Inventory of Psychotic-Like Anomalous Self-Experiences (IPASE). We hypothesized that there would be increased similarity in open-ended speech to the IPASE items in individuals with early-course psychosis (PSY) compared with healthy individuals, with clinical high-risk (CHR) individuals intermediate in similarity.
Open-ended interviews were obtained from 170 healthy control participants, 167 CHR participants, and 89 PSY participants. We calculated the semantic similarity between IPASE items and "I" sentences from transcribed speech samples using S-BERT (Sentence Bidirectional Encoder Representations from Transformers). Kolmogorov-Smirnov tests were used to compare distributions across groups. A nonnegative matrix factorization of cosine similarity was performed to rank IPASE items.
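The similarity and group-comparison steps can be illustrated with a minimal sketch. It assumes sentence embeddings have already been produced by a sentence encoder such as S-BERT; the toy two-dimensional vectors, the function names, and the choice to score each sentence by its maximum similarity over items are all hypothetical and are not taken from the study.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def max_item_similarity(sentence_vec, item_vecs):
    """Score one sentence by its closest inventory item (an illustrative choice)."""
    return max(cosine_similarity(sentence_vec, item) for item in item_vecs)

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare the CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Hypothetical stand-ins for embedded inventory items and "I" sentences.
item_vecs = [[1.0, 0.0], [0.0, 1.0]]
group_a_sentences = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.3]]
group_b_sentences = [[1.0, 1.0], [0.8, 1.0], [1.0, 0.9]]

scores_a = [max_item_similarity(s, item_vecs) for s in group_a_sentences]
scores_b = [max_item_similarity(s, item_vecs) for s in group_b_sentences]
print("KS statistic between groups:", ks_statistic(scores_a, scores_b))
```

The statistic ranges from 0 (identical empirical distributions) to 1 (no overlap). Obtaining a p-value would need the Kolmogorov distribution in addition, which this sketch omits.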
Spoken language of CHR individuals had the greatest semantic similarity to IPASE items when compared to both healthy control (s = 0.44, p < 10) and PSY (s = 0.36, p < 10) individuals, while IPASE scores were higher among PSY than CHR group participants. In addition, the nonnegative matrix factorization approach produced a data-driven domain that differentiated the CHR group from the others.
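The item-ranking idea behind nonnegative matrix factorization can be sketched as follows. This is a toy illustration using the standard multiplicative-update rule for the squared-error objective, not the authors' implementation. The matrix layout (rows are sentences, columns are inventory items), the number of components, and all names are assumptions. Because cosine similarities can be negative, a real pipeline would need some nonnegativity step (for example, clipping at zero), which is likewise an assumption here.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor nonnegative V (m x n) into W (m x k) and H (k x n)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[r][c] * num[r][c] / (den[r][c] + eps) for c in range(n)]
             for r in range(k)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[r][c] * num[r][c] / (den[r][c] + eps) for c in range(k)]
             for r in range(m)]
    return w, h

# Hypothetical similarity matrix: 4 sentences (rows) by 3 inventory items (columns).
similarity = [
    [0.9, 0.8, 0.1],
    [0.8, 0.9, 0.0],
    [0.1, 0.0, 0.9],
    [0.0, 0.1, 0.8],
]

w, h = nmf(similarity, k=2)
for component, loadings in enumerate(h):
    # Rank item indices by their loading on this component, highest first.
    ranking = sorted(range(len(loadings)), key=lambda j: -loadings[j])
    print("component", component, "item ranking:", ranking)
```

Each row of `H` gives the item loadings for one component, so sorting a row ranks the inventory items within that data-driven domain. Each row of `W` shows how strongly a sentence expresses each component.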
We found that open-ended interviews elicited language with greater semantic similarity to the IPASE from participants in the CHR group than from patients with psychosis. This demonstrates the utility of these methods for differentiating patients from healthy control participants. This complementary approach has the capacity to scale to large studies investigating phenomenological features of schizophrenia and potentially other clinical populations.