Huang Rong, Yi Siqi, Chen Jie, Chan Kit Ying, Chan Joey Wing Yan, Chan Ngan Yin, Li Shirley Xin, Wing Yun Kwok, Li Tim Man Ho
Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, The Chinese University of Hong Kong, Hong Kong, China.
Division of Psychology and Language Sciences, University College London, London WC1E 6BT, UK.
Behav Sci (Basel). 2024 Mar 11;14(3):225. doi: 10.3390/bs14030225.
Linguistic features, particularly the use of first-person singular pronouns (FPSPs), have been identified as potential indicators of suicidal ideation. Machine learning (ML) and natural language processing (NLP) have shown potential in suicide detection, but their clinical applicability remains underexplored. This study aimed to identify linguistic features associated with suicidal ideation and to develop ML models for its detection. NLP techniques were applied to clinical interview transcripts (n = 319) to extract relevant features, including four grammatical cases of FPSPs (subjective, objective, dative, and possessive) and first-person plural pronouns (FPPPs). Logistic regression analyses were conducted for each linguistic feature, controlling for age, gender, and depression. Gradient boosting, support vector machine, random forest, decision tree, and logistic regression models were trained and evaluated. Results indicated that all four cases of FPSPs were associated with depression (P < 0.05), but only the use of objective FPSPs was significantly associated with suicidal ideation (P = 0.02). Logistic regression and support vector machine models detected suicidal ideation with an area under the curve (AUC) of 0.57 (P < 0.05). In conclusion, FPSPs identified during clinical interviews might be a promising indicator of suicidal ideation in Chinese patients. ML algorithms might have the potential to aid clinicians in improving the detection of suicidal ideation in clinical settings.
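To make the pronoun features concrete, the following is a minimal sketch of how FPSP and FPPP usage could be counted in a Chinese transcript. It is not the authors' pipeline: the function name `pronoun_rates`, the regular expressions, the list of plural forms, and the per-100-characters normalisation are all assumptions chosen for illustration. It also does not separate the four grammatical cases examined in the study, which would require syntactic parsing.

```python
import re

# Assumption: "我" is treated as the first-person singular pronoun, and
# "我们" / "我們" / "我哋" as first-person plural forms. These lists are
# illustrative and are not taken from the study.
FPPP_PATTERN = re.compile(r"我(?:们|們|哋)")
FPSP_PATTERN = re.compile(r"我(?!们|們|哋)")
# Common CJK ideograph range, used as a rough transcript length measure.
CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")


def pronoun_rates(transcript: str) -> dict:
    """Count FPSPs and FPPPs and express each per 100 Chinese characters."""
    n_chars = len(CJK_CHAR.findall(transcript))
    fpsp = len(FPSP_PATTERN.findall(transcript))
    fppp = len(FPPP_PATTERN.findall(transcript))
    if n_chars == 0:
        # Avoid division by zero for empty or non-Chinese input.
        return {"fpsp_count": fpsp, "fppp_count": fppp,
                "fpsp_rate": 0.0, "fppp_rate": 0.0}
    return {
        "fpsp_count": fpsp,
        "fppp_count": fppp,
        "fpsp_rate": 100 * fpsp / n_chars,
        "fppp_rate": 100 * fppp / n_chars,
    }


if __name__ == "__main__":
    print(pronoun_rates("我觉得我们都很累，我睡不着。"))
```

Rates of this kind could then serve as predictors in a regression or classifier alongside covariates such as age, gender, and depression, which is the general shape of the analysis the abstract describes.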