Wiegersma Sytske, Hidajat Maurice, Schrieken Bart, Veldkamp Bernard, Olff Miranda
Department of Research Methodology, Measurement and Data Analysis, University of Twente, Enschede, Netherlands.
Interapy PLC, Amsterdam, Netherlands.
JMIR Ment Health. 2022 Apr 11;9(4):e21111. doi: 10.2196/21111.
Text mining and machine learning are increasingly used in mental health care practice and research, potentially saving time and effort in the diagnosis and monitoring of patients. Previous studies showed that mental disorders can be detected based on text, but they focused on screening for a single predefined disorder instead of multiple disorders simultaneously.
The aim of this study is to develop a Dutch multi-class text-classification model that screens for a range of mental disorders, so that new patients can be referred to the most suitable treatment.
On the basis of textual responses of patients (N=5863) to a questionnaire currently used for intake and referral, a 7-class classifier was developed to distinguish among anxiety, panic, posttraumatic stress, mood, eating, substance use, and somatic symptom disorders. A linear support vector machine was fitted using nested cross-validation grid search.
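The fitting procedure named above (a linear support vector machine tuned by grid search inside nested cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the use of scikit-learn, the TF-IDF features, the grid of `C` values, the fold counts, and the three-class English toy corpus are all hypothetical choices made for the example, since the abstract does not specify them.

```python
import random

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy vocabulary; the study used Dutch questionnaire answers
# and 7 classes, neither of which is reproduced here.
KEYWORDS = {
    "eating": ["food", "weight", "binge", "diet"],
    "panic": ["heart", "attack", "breath", "dizzy"],
    "mood": ["sad", "tired", "empty", "hopeless"],
}
FILLER = ["always", "often", "feel", "since", "work", "sleep"]


def make_toy_corpus(n_per_class=30, seed=0):
    """Build short synthetic documents mixing class keywords with filler words."""
    rng = random.Random(seed)
    texts, labels = [], []
    for label, words in KEYWORDS.items():
        for _ in range(n_per_class):
            tokens = rng.choices(words, k=3) + rng.choices(FILLER, k=4)
            rng.shuffle(tokens)
            texts.append(" ".join(tokens))
            labels.append(label)
    return texts, labels


def nested_cv_accuracy(texts, labels, seed=0):
    """Mean outer-fold accuracy of a grid-searched linear SVM."""
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),  # assumed feature extraction
        ("svm", LinearSVC()),          # linear support vector machine
    ])
    param_grid = {"svm__C": [0.1, 1.0, 10.0]}  # assumed search grid
    inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    # Inner loop: choose C using only the training part of each outer fold.
    search = GridSearchCV(pipeline, param_grid, cv=inner, scoring="accuracy")
    # Outer loop: score the tuned model on data the search never saw.
    scores = cross_val_score(search, texts, labels, cv=outer, scoring="accuracy")
    return scores.mean()


if __name__ == "__main__":
    toy_texts, toy_labels = make_toy_corpus()
    print(f"nested CV accuracy: {nested_cv_accuracy(toy_texts, toy_labels):.2f}")
```

Keeping the hyperparameter search inside each outer training fold means the reported accuracy is not inflated by tuning on the test data, which is the usual reason for nesting the two loops.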
The highest classification rate was found for eating disorders (82%). The scores for panic (55%), posttraumatic stress (52%), mood (50%), somatic symptom (50%), anxiety (35%), and substance use disorders (33%) were lower, likely because of overlapping symptoms. The overall classification accuracy (49%) was reasonable for a 7-class classifier, given that uniform random guessing among 7 classes would yield about 14%.
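The relation between the per-class rates and the overall accuracy can be checked with a little arithmetic. The sketch below assumes that "classification rate" means per-class recall (the share of patients with a given disorder who were assigned to it); the abstract does not define the term, and the helper names and the toy labels are hypothetical. The unweighted mean of the seven reported rates is 51%, slightly above the reported overall accuracy of 49%, which is what one would expect if the classes are not equally large.

```python
from collections import Counter

# Per-class rates as reported in the abstract.
REPORTED_RATES = {
    "eating": 0.82,
    "panic": 0.55,
    "posttraumatic stress": 0.52,
    "mood": 0.50,
    "somatic symptom": 0.50,
    "anxiety": 0.35,
    "substance use": 0.33,
}


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def accuracy(y_true, y_pred):
    """Fraction of all cases predicted correctly, weighted by class size."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Unweighted (macro) mean of the reported rates: 3.57 / 7 = 0.51.
macro_average = sum(REPORTED_RATES.values()) / len(REPORTED_RATES)

# Baseline for guessing uniformly at random among 7 classes: 1/7, about 0.143.
uniform_chance = 1 / len(REPORTED_RATES)

if __name__ == "__main__":
    print(f"macro average of reported rates: {macro_average:.2f}")
    print(f"uniform chance level: {uniform_chance:.3f}")
    # Hypothetical imbalanced toy labels: overall accuracy follows the larger class.
    y_true = ["a", "a", "a", "b"]
    y_pred = ["a", "a", "b", "b"]
    print(per_class_recall(y_true, y_pred), accuracy(y_true, y_pred))
```

In the toy labels the per-class recalls are 2/3 and 1, giving a macro average of about 0.83, while the overall accuracy is 0.75 because the larger class carries more weight. The same mechanism can separate the 51% macro average from the 49% overall accuracy, although the abstract does not report the class sizes needed to confirm this.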
A classification model was developed that could screen text for multiple mental health disorders. The screener resulted in an additional outcome score that may serve as input for a formal diagnostic interview and referral. This may lead to a more efficient and standardized intake process.