Deeper insight into speech characteristics of patients at ultra-high risk using classification and explainability models.

Authors

Kim-Dufor Deok-Hee, Walter Michel, Krebs Marie-Odile, Haralambous Yannis, Lenca Philippe, Lemey Christophe

Affiliations

Limics, Sorbonne Université, Université Sorbonne Paris-Nord, INSERM, Paris, France.

Unité de Recherche Clinique en Psychiatrie (URCP), Department of Psychiatry, Centre Hospitalier Universitaire (CHU) de Brest, Brest, France.

Publication information

Front Psychiatry. 2025 Jun 16;16:1595197. doi: 10.3389/fpsyt.2025.1595197. eCollection 2025.

DOI: 10.3389/fpsyt.2025.1595197
PMID: 40589653
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12206705/

Abstract

INTRODUCTION

Peculiar use of language and even language deficits are among the well-known signs of schizophrenia. Different language features analyzed using natural language processing and machine learning have been reported to differentiate patients at ultra-high risk for psychosis. However, it has not always been explained how, and to what extent, those linguistic markers distinguish patients. This study aims to find relevant linguistic markers for classifying patients at ultra-high risk and to explain how the detected markers contribute to the classification.

METHODS

The first consultations of 68 patients with a psychiatrist (15 not-at-risk patients, 45 at-risk patients, and 8 patients with first-episode psychosis) were recorded, transcribed verbatim, and annotated for analyses using natural language processing. A gradient-boosted decision tree algorithm was tested to evaluate its potential to correctly classify the three categories of patients and to find relevant linguistic markers at the level of lexical richness, semantic coherence, speech disfluency, and syntactic complexity. The Synthetic Minority Oversampling Technique (SMOTE) was used to handle imbalanced data, and SHapley Additive exPlanations (SHAP) values were computed to measure feature importance and each feature's contribution to the classification.
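
The oversampling step can be illustrated with a minimal sketch of the core SMOTE idea: a synthetic minority sample is placed at a random position on the segment between an existing minority sample and one of its nearest minority neighbours. This is not the authors' code. The function name `smote`, the parameters `n_new`, `k`, and `seed`, and the two-feature toy vectors are all assumptions made for illustration.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical two-feature vectors standing in for the linguistic measures
# of a small patient group; the numbers are invented, not study data.
fep = [[0.20, 3.1], [0.25, 2.8], [0.22, 3.4], [0.30, 3.0]]
new_points = smote(fep, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the range already spanned by that group, which is why the technique adds no information beyond the original eight first-episode patients.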

RESULTS

The model yielded good performance, that is, 0.82 accuracy, 0.82 F2-score, 0.85 precision, 0.82 recall, and 0.86 ROC-AUC score, with four linguistic variables that concern weak coherence, the use of "I," and filled pauses.
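
To make the reported metrics concrete, the sketch below computes accuracy and per-class precision, recall, and F-beta from a three-class confusion matrix. With beta = 2, the F-score weights recall more heavily than precision. The confusion matrix is invented for illustration and does not reproduce the study's results; the abstract also does not state how per-class scores were averaged, so no averaging is shown. The function names are assumptions.

```python
def f_beta(precision, recall, beta=2.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def accuracy(confusion):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    return sum(confusion[i][i] for i in range(len(confusion))) / total

def per_class_scores(confusion, beta=2.0):
    """confusion[i][j] = samples of true class i predicted as class j.
    Returns one (precision, recall, f_beta) tuple per class."""
    n = len(confusion)
    scores = []
    for c in range(n):
        tp = confusion[c][c]
        predicted = sum(confusion[r][c] for r in range(n))
        actual = sum(confusion[c])
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        scores.append((precision, recall, f_beta(precision, recall, beta)))
    return scores

# Hypothetical confusion matrix (rows: true class, columns: predicted class).
# The counts are invented and are not the study's results.
toy = [[4, 1, 0],
       [1, 7, 1],
       [0, 1, 2]]

print(accuracy(toy))
for precision, recall, f2 in per_class_scores(toy):
    print(round(precision, 3), round(recall, 3), round(f2, 3))
```

A quick check of the recall weighting: `f_beta(0.5, 1.0)` is larger than `f_beta(1.0, 0.5)`, even though the two inputs are mirror images of each other.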

DISCUSSION

The findings in this study suggest that weak coherence plays a key role in classification. No significant differences in the use of "I" and filled pauses were found between groups using a statistical test, but an explainability model showed their different contributions. The contribution of each linguistic feature to the classification by patient group provided deeper insight into linguistic manifestations of each patient group and their subtle differences, which could help better analyze and understand patients' language behaviors.
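
The idea behind per-feature contributions can be sketched with exact Shapley values on a toy scoring function. The sketch enumerates every coalition of features and replaces absent features with a baseline value, which is a simplification; the abstract does not say how the SHAP values were computed, and SHAP implementations for tree models use faster algorithms. The function `toy_model`, the three feature names, and all numbers are assumptions.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).
    Features outside a coalition are replaced by their baseline value."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(rest, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical scoring function over three invented features:
# [weak_coherence, first_person_rate, filled_pause_rate]
def toy_model(z):
    coherence, first_person, pauses = z
    return 2.0 * coherence + 0.5 * first_person * pauses

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

In this toy case the contributions sum to `toy_model(x) - toy_model(baseline)`, and the interaction term is split equally between the two features that produce it. That split is how a feature with no significant group difference on its own can still receive a non-zero contribution.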
